(54) Title: ISOLATION AND EXPRESSION OF FARNESENE SYNTHASE FROM PEPPERMINT, MENTHA X PIPERITA, L. 
(57) Abstract

A cDNA encoding (E)-β-farnesene synthase from peppermint (Mentha piperita) has
been isolated and sequenced, and the corresponding amino acid sequence has been
determined. Accordingly, an isolated DNA sequence (SEQ ID NO:1) is provided which
codes for the expression of (E)-β-farnesene synthase (SEQ ID NO:2), from peppermint
(Mentha piperita). In other aspects, replicable recombinant cloning vehicles are
provided which code for (E)-β-farnesene synthase, or for a base sequence sufficiently
complementary to at least a portion of (E)-β-farnesene synthase DNA or RNA to enable
hybridization therewith. In yet other aspects, modified host cells are provided that
have been transformed, transfected, infected and/or injected with a recombinant
cloning vehicle and/or DNA sequence encoding (E)-β-farnesene synthase. Thus, systems
and methods are provided for the recombinant expression of the aforementioned
recombinant (E)-β-farnesene synthase that may be used to facilitate its production,
isolation and purification in significant amounts. Recombinant (E)-β-farnesene
synthase may be used to obtain expression or enhanced expression of (E)-β-farnesene
synthase in plants in order to enhance the production of (E)-β-farnesene, or may be
otherwise employed for the regulation or expression of (E)-β-farnesene synthase, or
the production of its product.
ISOLATION AND EXPRESSION OF FARNESENE SYNTHASE FROM PEPPERMINT, MENTHA X PIPERITA, L.

This invention was supported in part by NIH grant number GM-31354 and by
Hatch Project grant number 0268 from the Agricultural Research Center, Washington
State University. The government has certain rights in the invention.

Field of the Invention 
The present invention relates to nucleic acid sequences which code for
(E)-β-farnesene synthases, such as the (E)-β-farnesene synthase from Mentha piperita,
and to vectors containing the sequences, host cells containing the sequences and
methods of producing recombinant (E)-β-farnesene synthases and their mutants.

Background of the Invention
(E)-β-Farnesene (FIGURE 1) is an acyclic sesquiterpene olefin that occurs in a
wide range of both plant and animal taxa. Over 600 papers have been published on
the occurrence of this natural product and its deployment as an important courier in
chemical communication. The olefin is found in the essential oil of hundreds of
species of both gymnosperms, such as Torreya taxifolia (Florida torreya) (Shu, C. K.,
Lawrence, B. M. and Croom, E. M., Jr. (1995) J. Essent. Oil Res. 7, 71-72) and
Larix leptolepis (larch) (Nabeta, K., Ara, Y., Aoki, Y. and Miyake, M. (1990) J. Nat.
Prod. 53, 1241-1248), and angiosperms, such as Robinia pseudoacacia (black locust)
(Kamden, D. P., Gruber, K., Barkman, L. and Gage, D. A. (1994) J. Essent. Oil Res.
6, 199-200), Medicago sativa (alfalfa) (Kamm, J. A. and Buttery, R. G. (1983)
Entomol. Exp. Appl. 33, 129-134), Chamomilla recutita (chamomile) (Matos,
F. J. A., Machado, M. I. L., Alencar, J. W. and Craveiro, A. A. (1993) J. Essent. Oil
Res. 5, 337-339), Vitis vinifera (grapes) (Buchbauer, G., Jirovetz, L., Wasicky, M.
and Nikiforov, A. (1994) J. Essent. Oil Res. 6, 311-314), Cannabis sativa (hemp)
(Lemberkovics, E., Veszki, P., Verzar-Petri, G. and Trka, A. (1981) Sci. Pharm. 49,
401-408), Zea mays (corn) (Turlings, T. C. J., Tumlinson, J. H., Heath, R. R.,
Proveaux, A. T. and Doolittle, R. E. (1991) J. Chem. Ecol. 17, 2235-2251), Piper
nigrum (black pepper), Daucus carota (carrot), and Mentha x piperita (peppermint)
(Lawrence, B. M. (1972) Ann. Acad. Bras. Cienc. 44 (suppl.), 191-197).

While socially dominant male mice produce both α-farnesene and (E)-β-farnesene
in their urine as pheromones (Novotny, M., Harvey, S. and Jemiolo, B. (1990)
Experientia 46, 109-113), it is in the insects and plants that the use of
(E)-β-farnesene as a semiochemical is most extensive. (E)-β-Farnesene is emitted by
the Dufour's gland of andrenid bees (Fernandes, A., Duffield, R. M., Wheeler, J. W. and
LaBerge, W. E. (1981) J. Chem. Ecol. 7, 453-460) and by several genera of ants (Ali,
M. F., Morgan, E. D., Attygalle, A. B. and Billen, J. P. J. (1987) Z. Naturforsch. 42,
955-960; Jackson, B. D., Morgan, E. D. and Billen, J. P. J. (1990) Naturwiss. 77,
187-188; Ollet, D. G., Morgan, E. D., Attygalle, A. B. and Billen, J. P. J. (1987) Z.
Naturforsch. 42, 141-146), where it serves both as a defensive allomone and as a trail
pheromone. This sesquiterpene is synthesized de novo in the osmeterial glands of
larval Papilio (Lepidoptera:Papilionidae) as an allomone (Honda, K. (1990) Insect
Biochem. 20, 245-250), and it functions as a feeding stimulant to the sand fly
Lutzomyia longipalpis (Diptera:Psychodidae), an important vector of the blood
disease leishmaniasis (Tesh, R. B., Guzman, H. and Wilson, M. (1992) J. Med.
Entomol. 29, 226-231). Several species of predatory carabid beetles use
(E)-β-farnesene as a prey-finding kairomone (Kielty, J. P., Allen-Williams, L. J.,
Underwood, N. and Eastwood, E. A. (1996) J. Insect Behav. 9, 237-250). When
released by corn, this olefin is also a kairomonal oviposition stimulant to the European
corn borer (Ostrinia) (Binder, B. F., Robbins, J. and Wilson, R. L. (1995) J.
Chem. Ecol. 21, 1315-1327). (E)-β-Farnesene is the major component of pollen odor
in Lupinus and stimulates pollination behavior in bumblebees (Dobson, H. E. M.,
Groth, I. and Bergstroem, G. (1996) Am. J. Bot. 83, 877-885). Feeding by larval
lepidopterans, such as Heliothis or Spodoptera (Noctuidae), increases the amount of
(E)-β-farnesene released by corn; the volatile olefin is then detected as a synomone by
the parasitic wasp Cotesia marginiventris (Hymenoptera:Braconidae) for locating the
lepidopteran hosts (Turlings, T. C. J., Tumlinson, J. H., Heath, R. R., Proveaux, A. T.
and Doolittle, R. E. (1991) J. Chem. Ecol. 17, 2235-2251). Circumstantial evidence
also suggests the lepidopteran induced production and emission of (E)-β-farnesene
from corn serves as a synomone for Cotesia kariyai (Takabayashi, J., Takahashi, S.,
Dicke, M. and Posthumus, M. A. (1995) J. Chem. Ecol. 21, 273-287) and from
cotton leaves as a synomone for C. marginiventris (Pare, P. W. and Tumlinson, J. H.
(1997) Nature 385, 30-31; Loughrin, J. H., Manukian, A., Heath, R. R., Turlings,
T. C. J. and Tumlinson, J. H. (1994) Proc. Natl. Acad. Sci. USA 91, 11836-11840).

Perhaps of greatest significance in plant-insect interactions is the use of
(E)-β-farnesene by most aphid species as an alarm pheromone (Bowers, W. S., Nault,
L. R., Webb, R. E. and Dutky, S. R. (1972) Science 177, 1121-1122; Edwards, L. J.,
Siddall, J. B., Dunham, L. L., Uden, P. and Kislow, C. J. (1973) Nature 241,
126-127). Aphids exposed to (E)-β-farnesene become agitated and disperse from their
host plant (Wohlers, P. (1981) Z. Angew. Entomol. 92, 329-336). Alate aphids are
usually more sensitive than are apterae species and will often not colonize a host
displaying (E)-β-farnesene. Ants that defend aphids are sensitive to host-emitted
(E)-β-farnesene and, when exposed, will display aggressive behavior (Nault, L. R. and
Montgomery, M. E. (1976) Science 192, 1349-1351). (E)-β-Farnesene also mimics
the action of juvenile hormone III in some insects (Mauchamp, B. and Pickett, J. A.
(1987) Agronomie 7, 523-529), may play a role in control of aphid morphological
types, and is acutely toxic to aphids at a dose of 100 ng/aphid (van Oosten, A. M.,
Gut, J., Harrewijn, P. and Piron, P. G. (1990) Acta Phytopathol. Entomol. Hung.
25, 331-342). (E)-β-Farnesene vapor is also toxic to whiteflies (Klijnstra, K. W.,
Corts, K. A. and van Oosten, A. M. (1992) Meded. Fac. Landbouwwet. 57, 485-491).

Efforts to control aphid behavior by topical application of (E)-β-farnesene to
crops have met with little success, due to volatility and rapid oxidative inactivation in
air (Dawson, G. W., Griffiths, D. C., Pickett, J. A., Plumb, R. T., Woodcock, C. M.
and Zhang, Z. N. (1988) Pest. Sci. 22, 17-30). Derivatives of (E)-β-farnesene with
reduced volatility, or increased stability, have shown promise in reducing
aphid-transmitted viruses, such as barley mosaic virus (Dawson, G. W., Griffiths, D. C.,
Pickett, J. A., Plumb, R. T., Woodcock, C. M. and Zhang, Z. N. (1988) Pest. Sci. 22,
17-30), potato virus Y (Gibson, R. W., Pickett, J. A., Dawson, G. W., Rice, A. D.
and Stribley, M. F. (1984) Ann. Appl. Entomol. 104, 203-209), and beet mosaic virus
(Gibson, R. W., Pickett, J. A., Dawson, G. W., Rice, A. D. and Stribley, M. F. (1984)
Ann. Appl. Entomol. 104, 203-209). The wild potato Solanum berthaultii, which
produces (E)-β-farnesene in type A trichomes, is more repellent to the green peach
aphid than are commercial varieties of S. tuberosum that produce lower levels of the
olefin (Gibson, R. W. and Pickett, J. A. (1983) Nature 302, 608-609; Ave, A.,
Gregory, P. and Tingey, W. M. (1987) Entomol. Exp. Appl. 44, 131-138). In alfalfa,
repellency to the blue alfalfa aphid and the pea aphid is correlated with the leaf
content of (E)-β-farnesene, but not with the amount of the co-occurring sesquiterpene
caryophyllene (Mostafavi, R., Henning, J. A., Gardea-Torresday, J. and Ray, I. M.
(1996) J. Chem. Ecol. 22, 1629-1638).

For plants that produce (E)-β-farnesene, breeding for increased production
has met with some success (Mostafavi, R., Henning, J. A., Gardea-Torresday, J. and
Ray, I. M. (1996) J. Chem. Ecol. 22, 1629-1638), but has been limited by genetic
variation in these species. (E)-β-Farnesene synthase has been purified from maritime
pine (Pinus pinaster) and characterized (Salin, F., Pauly, G., Charon, J. and Gleizes,
M. (1995) J. Plant Phys. 146, 203-209), but the gene has not yet been isolated from
any source. A cDNA clone for (E)-β-farnesene synthase would, by transgenic
manipulation, provide a valuable addition to the arsenal of natural compounds active
in host plant resistance. The substrate for (E)-β-farnesene synthase is farnesyl
diphosphate, a ubiquitous isoprenoid intermediate involved in cytoplasmic phytosterol
biosynthesis. Sesquiterpene synthases lack plastidial targeting sequences and are
localized to the cytoplasm (Chappell, J. (1995) Annu. Rev. Plant Physiol. Plant Mol.
Biol. 46, 521-547). Therefore, even in plants that do not normally produce
sesquiterpenes, a recombinant (E)-β-farnesene synthase would be directed to the
cytoplasm where substrate is supplied by the mevalonate pathway and where
production of (E)-β-farnesene should result.

Summary of the Invention 
In accordance with the foregoing, a cDNA encoding (E)-β-farnesene synthase
from peppermint (Mentha piperita) has been isolated and sequenced, and the
corresponding amino acid sequence has been deduced. Accordingly, the present
invention relates to isolated DNA sequences which code for the expression of
(E)-β-farnesene synthase, such as the sequence designated SEQ ID NO:1 which encodes
an (E)-β-farnesene synthase protein (SEQ ID NO:2) from peppermint (Mentha piperita).
Additionally, the present invention relates to isolated, recombinant (E)-β-farnesene
synthase proteins from peppermint (Mentha piperita). In other aspects, the present
invention is directed to replicable recombinant cloning vehicles comprising a nucleic
acid sequence, e.g., a DNA sequence which codes for an (E)-β-farnesene synthase, or
for a base sequence sufficiently complementary to at least a portion of DNA or RNA
encoding (E)-β-farnesene synthase to enable hybridization therewith (e.g., antisense
RNA or fragments of DNA complementary to a portion of DNA or RNA molecules
encoding (E)-β-farnesene synthase which are useful as polymerase chain reaction
primers or as probes for (E)-β-farnesene synthase or related genes). In yet other
aspects of the invention, modified host cells are provided that have been transformed,
transfected, infected and/or injected with a recombinant cloning vehicle and/or DNA
sequence of the invention. Thus, the present invention provides for the recombinant
expression of (E)-β-farnesene synthase, and the inventive concepts may be used to
facilitate the production, isolation and purification of significant quantities of
recombinant (E)-β-farnesene synthase (or of its primary enzyme products) for
subsequent use, to obtain expression or enhanced expression of (E)-β-farnesene
synthase in plants, microorganisms or animals, or may be otherwise employed in an
environment where the regulation or expression of (E)-β-farnesene synthase is desired
for the production of this synthase, or its enzyme product, or derivatives thereof.

Brief Description of the Drawings

The foregoing aspects and many of the attendant advantages of this invention
will become more readily appreciated as the same becomes better understood by
reference to the following detailed description, when taken in conjunction with the
accompanying drawings, wherein:

FIGURE 1. The sesquiterpene synthase substrate, farnesyl diphosphate, and
sesquiterpene olefins found in peppermint essential oil.

FIGURE 2. Radio-GC of the sesquiterpene olefins generated from
[1-³H]farnesyl diphosphate by an enzyme preparation from peppermint oil gland
secretory cells. The olefin fraction of steam-distilled peppermint oil was used as
internal standard, and only the portion of the chromatogram containing the
sesquiterpene olefins is shown.

FIGURE 3A. GC-MS of the products generated from farnesyl diphosphate by
the recombinant (E)-β-farnesene synthase. Panel A: Total ion chromatogram.
Numbered peaks are sesquiterpene olefins.

FIGURE 3B. Mass spectrum and retention time of peak 1 designated in
FIGURE 3A.

FIGURE 3C. Mass spectrum and retention time of authentic (E)-β-farnesene
from parsley oil.

FIGURE 3D. Mass spectrum and retention time of peak 6 designated in
FIGURE 3A. The spectrum of this minor product is compromised by the low ion
abundance and the corresponding prominence of background ions.

FIGURE 3E. Mass spectrum and retention time of authentic δ-cadinene.

FIGURE 4. Proposed mechanism for the formation of (E)-β-farnesene and
δ-cadinene from farnesyl diphosphate. OPP denotes the diphosphate moiety.
Ionization of the enzyme-bound nerolidyl diphosphate intermediate and proton
elimination can also produce (E)-β-farnesene.

FIGURE 5. Monoterpene olefins generated from the alternate substrate
geranyl diphosphate by recombinant (E)-β-farnesene synthase.

Detailed Description of the Preferred Embodiment 
As used herein, the terms "amino acid" and "amino acids" refer to all naturally
occurring L-α-amino acids or their residues. The amino acids are identified by either
the single-letter or three-letter designations:

Asp   D   aspartic acid        Ile   I   isoleucine
Thr   T   threonine            Leu   L   leucine
Ser   S   serine               Tyr   Y   tyrosine
Glu   E   glutamic acid        Phe   F   phenylalanine
Pro   P   proline              His   H   histidine
Gly   G   glycine              Lys   K   lysine
Ala   A   alanine              Arg   R   arginine
Cys   C   cysteine             Trp   W   tryptophan
Val   V   valine               Gln   Q   glutamine
Met   M   methionine           Asn   N   asparagine
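
The designations in the table above can be captured as a small lookup, which is convenient when moving between the two notations. This is only an illustrative sketch: the names THREE_TO_ONE, ONE_TO_THREE, to_one_letter and to_three_letter are hypothetical helpers introduced here, not part of the specification.

```python
# Three-letter to single-letter designations, transcribed from the table above.
THREE_TO_ONE = {
    "Asp": "D", "Ile": "I", "Thr": "T", "Leu": "L", "Ser": "S",
    "Tyr": "Y", "Glu": "E", "Phe": "F", "Pro": "P", "His": "H",
    "Gly": "G", "Lys": "K", "Ala": "A", "Arg": "R", "Cys": "C",
    "Trp": "W", "Val": "V", "Gln": "Q", "Met": "M", "Asn": "N",
}

# Reverse lookup built from the same table.
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(residues):
    """Convert an iterable of three-letter codes into a single-letter string."""
    return "".join(THREE_TO_ONE[residue] for residue in residues)


def to_three_letter(sequence):
    """Convert a single-letter sequence into a list of three-letter codes."""
    return [ONE_TO_THREE[letter] for letter in sequence.upper()]
```

For example, to_one_letter(["Met", "Lys"]) gives "MK". A code outside the twenty listed residues raises KeyError, which is deliberate in this sketch so that unexpected symbols are not silently dropped.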



As used herein, the term "nucleotide" means a monomeric unit of DNA or
RNA containing a sugar moiety (pentose), a phosphate and a nitrogenous heterocyclic
base. The base is linked to the sugar moiety via the glycosidic carbon (1' carbon of
pentose) and that combination of base and sugar is called a nucleoside. The base
characterizes the nucleotide with the four bases of DNA being adenine ("A"), guanine
("G"), cytosine ("C") and thymine ("T"). Inosine ("I") is a synthetic base that can be
used to substitute for any of the four, naturally-occurring bases (A, C, G or T). The
four RNA bases are A, G, C and uracil ("U"). The nucleotide sequences described
herein comprise a linear array of nucleotides connected by phosphodiester bonds
between the 3' and 5' carbons of adjacent pentoses.

"Oligonucleotide" refers to short length single or double stranded sequences of
deoxyribonucleotides linked via phosphodiester bonds. The oligonucleotides are
chemically synthesized by known methods and purified, for example, on
polyacrylamide gels.
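
The base alphabets defined above (A, G, C, T for DNA; A, G, C, U for RNA) can be expressed as a short validity check. The sketch below is illustrative only; the function names are hypothetical, and to_rna_alphabet merely rewrites a DNA sequence with U in place of T rather than modelling transcription of a particular strand.

```python
DNA_BASES = frozenset("ACGT")  # adenine, cytosine, guanine, thymine
RNA_BASES = frozenset("ACGU")  # uracil replaces thymine in RNA


def is_dna(sequence):
    """True if the sequence is non-empty and uses only the four DNA bases."""
    return bool(sequence) and set(sequence.upper()) <= DNA_BASES


def is_rna(sequence):
    """True if the sequence is non-empty and uses only the four RNA bases."""
    return bool(sequence) and set(sequence.upper()) <= RNA_BASES


def to_rna_alphabet(dna):
    """Rewrite a DNA sequence in the RNA alphabet (T becomes U)."""
    if not is_dna(dna):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return dna.upper().replace("T", "U")
```

Inosine ("I") is intentionally not accepted here; a primer-design tool that allows inosine would need a wider alphabet.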
The term "(E)-β-farnesene synthase" refers to an enzyme that is capable of
converting farnesyl diphosphate to (E)-β-farnesene.

The term "essential oil plant," or "essential oil plants," refers to a group of
plant species that produce high levels of monoterpenoid and/or sesquiterpenoid and/or
diterpenoid oils, and/or high levels of monoterpenoid and/or sesquiterpenoid and/or
diterpenoid resins. The foregoing oils and/or resins account for greater than about
0.005% of the fresh weight of an essential oil plant that produces them. The essential
oils and/or resins are more fully described, for example, in E. Guenther, The Essential
Oils, Vols. I-VI, R.E. Krieger Publishing Co., Huntington N.Y., 1975, incorporated
herein by reference. The essential oil plants include, but are not limited to:

Lamiaceae, including, but not limited to, the following species: Ocimum
(basil), Lavandula (lavender), Origanum (oregano), Mentha (mint), Salvia (sage),
Rosmarinus (rosemary), Thymus (thyme), Satureja and Monarda.

Umbelliferae, including, but not limited to, the following species: Carum
(caraway), Anethum (dill), Foeniculum (fennel) and Daucus (carrot).

Asteraceae (Compositae), including, but not limited to, the following species:
Artemisia (tarragon, sage brush), Tanacetum (tansy).

Rutaceae (e.g., citrus plants); Rosaceae (e.g., roses); Myrtaceae (e.g.,
eucalyptus, Melaleuca); the Gramineae (e.g., Cymbopogon (citronella)); Geraniaceae
(Geranium) and certain conifers including Abies (e.g., Canadian balsam), Cedrus
(cedar) and Thuja and Juniperus.

The range of essential oil plants is more fully set forth in E. Guenther, The
Essential Oils, Vols. I-VI, R.E. Krieger Publishing Co., Huntington N.Y., 1975,
which is incorporated herein by reference.
The term "angiosperm" refers to a class of plants that produce seeds that are
enclosed in an ovary.

The term "gymnosperm" refers to a class of plants that produce seeds that are
not enclosed in an ovary.

Abbreviations used are: bp, base pair; dpm, disintegrations per minute; DTT,
dithiothreitol; EDTA, ethylenediaminetetraacetic acid; FDP, farnesyl diphosphate;
GC, gas chromatography; GDP, geranyl diphosphate; GGDP, geranylgeranyl
diphosphate; I, identity; IPTG, isopropyl-β-D-thiogalactopyranoside; LB,
Luria-Bertani; Mopso, 3-(N-morpholino)-2-hydroxypropane-sulfonic acid; MS, mass
spectrometry; PVPP, polyvinylpolypyrrolidone; S, similarity.
The term "percent identity" (%I) means the percentage of amino acids or
nucleotides that occupy the same relative position when two amino acid sequences, or
two nucleic acid sequences, are aligned side by side.
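
As a rough illustration of the definition above, percent identity can be computed from two sequences that have already been aligned. This is a minimal sketch under stated assumptions: gaps are written as "-", and the denominator is the number of alignment columns that are not gap-in-both. The definition above does not fix a denominator, so that choice (and the function name percent_identity) is an assumption made here.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned and therefore equal in length.
    Columns where both sequences have a gap are ignored (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared
```

With this convention, the aligned pair "AC-T" and "ACGT" shares three of four columns. Other conventions, such as dividing by the length of the shorter sequence, would give different values for gapped alignments.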

The term "percent similarity" (%S) is a statistical measure of the degree of
relatedness of two compared protein sequences. The percent similarity is calculated
by a computer program that assigns a numerical value to each compared pair of amino
acids based on chemical similarity (e.g., whether the compared amino acids are acidic,
basic, hydrophobic, aromatic, etc.) and/or evolutionary distance as measured by the
minimum number of base pair changes that would be required to convert a codon
encoding one member of a pair of compared amino acids to a codon encoding the
other member of the pair. Calculations are made after a best fit alignment of the two
sequences has been made empirically by iterative comparison of all possible
alignments. (Henikoff, S. and Henikoff, J. G., Proc. Nat'l Acad. Sci. USA
89:10915-10919, 1992).

The abbreviation "SSC" refers to a buffer used in nucleic acid hybridization
solutions. One liter of the 20X (twenty times concentrate) stock SSC buffer solution
(pH 7.0) contains 175.3 g sodium chloride and 88.2 g sodium citrate.
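
The stock composition above can be checked with simple arithmetic, and working dilutions follow from C1·V1 = C2·V2. The sketch below assumes the sodium citrate is the trisodium citrate dihydrate salt (molar mass about 294.10 g/mol); the text does not name the salt form, so that value is an assumption, as are the function names.

```python
# Grams per liter of 20X stock, as stated in the text.
NACL_G_PER_L = 175.3
CITRATE_G_PER_L = 88.2

# Molar masses in g/mol.
NACL_MOLAR_MASS = 58.44
# Assumption: trisodium citrate dihydrate; the text does not specify the salt form.
CITRATE_DIHYDRATE_MOLAR_MASS = 294.10


def stock_molarity(grams_per_liter, molar_mass):
    """Molar concentration (mol/L) of one component of the 20X stock."""
    return grams_per_liter / molar_mass


def dilution_volumes(final_x, final_volume_ml, stock_x=20.0):
    """Return (stock_ml, water_ml) needed to prepare a working-strength buffer."""
    if not 0 < final_x <= stock_x:
        raise ValueError("final strength must be positive and at most the stock strength")
    stock_ml = final_volume_ml * final_x / stock_x
    return stock_ml, final_volume_ml - stock_ml
```

Under these assumptions the 20X stock works out to roughly 3.0 M sodium chloride and 0.3 M sodium citrate, and dilution_volumes(2, 500) indicates 50 ml of stock plus 450 ml of water for 500 ml of 2X SSC.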

The terms "alteration", "amino acid sequence alteration", "variant" and "amino
acid sequence variant" refer to (E)-β-farnesene synthase molecules with some
differences in their amino acid sequences as compared to the corresponding, native,
i.e., naturally-occurring, (E)-β-farnesene synthases. Ordinarily, the variants will
possess at least about 70% homology with the corresponding native (E)-β-farnesene
synthases, and preferably, they will be at least about 80% homologous with the
corresponding, native (E)-β-farnesene synthases. The amino acid sequence variants
of the (E)-β-farnesene synthases falling within this invention possess substitutions,
deletions, and/or insertions at certain positions. Sequence variants of (E)-β-farnesene
synthases may be used to attain desired enhanced or reduced enzymatic activity,
modified regiochemistry or stereochemistry, or altered substrate utilization or product
distribution.

Substitutional (E)-β-farnesene synthase variants are those that have at least
one amino acid residue in the native (E)-β-farnesene synthase sequence removed and
a different amino acid inserted in its place at the same position. The substitutions may
be single, where only one amino acid in the molecule has been substituted, or they
may be multiple, where two or more amino acids have been substituted in the same
molecule. Substantial changes in the activity of the (E)-β-farnesene synthase
molecules of the present invention may be obtained by substituting an amino acid with
a side chain that is significantly different in charge and/or structure from that of the
native amino acid. This type of substitution would be expected to affect the structure
of the polypeptide backbone and/or the charge or hydrophobicity of the molecule in
the area of the substitution.

Moderate changes in the activity of the (E)-β-farnesene synthase molecules of
the present invention would be expected by substituting an amino acid with a side
chain that is similar in charge and/or structure to that of the native molecule. This
type of substitution, referred to as a conservative substitution, would not be expected
to substantially alter either the structure of the polypeptide backbone or the charge or
hydrophobicity of the molecule in the area of the substitution.
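
One way to make the distinction between conservative and non-conservative substitutions concrete is to group residues by side-chain character and compare groups. The grouping below is a common textbook-style partition chosen purely for illustration; it is an assumption of this sketch, not a classification given in this specification, and the names SIDE_CHAIN_CLASS and classify_substitution are hypothetical.

```python
# Assumed side-chain classes (illustrative only; other groupings are in use).
SIDE_CHAIN_CLASS = {
    "D": "acidic", "E": "acidic",
    "K": "basic", "R": "basic", "H": "basic",
    "S": "polar", "T": "polar", "N": "polar", "Q": "polar",
    "A": "aliphatic", "V": "aliphatic", "L": "aliphatic",
    "I": "aliphatic", "M": "aliphatic",
    "F": "aromatic", "W": "aromatic", "Y": "aromatic",
    "G": "special", "P": "special", "C": "special",
}


def classify_substitution(native, replacement):
    """Label a single-residue substitution as conservative or non-conservative."""
    native, replacement = native.upper(), replacement.upper()
    if native == replacement:
        raise ValueError("identical residues are not a substitution")
    same_class = SIDE_CHAIN_CLASS[native] == SIDE_CHAIN_CLASS[replacement]
    return "conservative" if same_class else "non-conservative"
```

Under this assumed grouping, replacing aspartic acid with glutamic acid is labeled conservative, while replacing it with lysine reverses the charge and is labeled non-conservative. Whether a given change actually alters enzyme activity still has to be determined by assay.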

Insertional (E)-β-farnesene synthase variants are those with one or more
amino acids inserted immediately adjacent to an amino acid at a particular position in
the native (E)-β-farnesene synthase molecule. Immediately adjacent to an amino acid
means connected to either the α-carboxy or α-amino functional group of the amino
acid. The insertion may be one or more amino acids. Ordinarily, the insertion will
consist of one or two conservative amino acids. Amino acids similar in charge and/or
structure to the amino acids adjacent to the site of insertion are defined as
conservative. Alternatively, this invention includes insertion of an amino acid with a
charge and/or structure that is substantially different from the amino acids adjacent to
the site of insertion.

Deletional variants are those where one or more amino acids in the native
(E)-β-farnesene synthase molecules have been removed. Ordinarily, deletional variants
have one or two amino acids deleted in a particular region of the (E)-β-farnesene
synthase molecule.

The terms "biological activity", "biologically active", "activity" and "active"
refer to the ability of the (E)-β-farnesene synthases of the present invention to
catalyze the formation of (E)-β-farnesene from farnesyl diphosphate. (E)-β-Farnesene
synthase activity is measured in an enzyme activity assay, such as the assay described
in Example 1 herein. Amino acid sequence variants of the (E)-β-farnesene synthases
of the present invention may have desirable altered biological activity including, for
example, altered reaction kinetics, substrate utilization, product distribution or other
characteristics such as regiochemistry and stereochemistry.

The terms "DNA sequence encoding", "DNA encoding" and "nucleic acid
encoding" refer to the order or sequence of deoxyribonucleotides along a strand of
deoxyribonucleic acid. The order of these deoxyribonucleotides determines the order
of amino acids along the translated polypeptide chain. The DNA sequence thus codes
for the amino acid sequence.

The terms "replicable expression vector" and "expression vector" refer to a
piece of DNA, usually double-stranded, which may have inserted into it another piece
of DNA (the insert DNA) such as, but not limited to, a cDNA molecule. The vector
is used to transport the insert DNA into a suitable host cell. The insert DNA may be
derived from the host cell, or may be derived from a different cell or organism. Once
in the host cell, the vector can replicate independently of or coincidental with the host
chromosomal DNA, and several copies of the vector and its inserted DNA may be
generated. In addition, the vector contains the necessary elements that permit
translating the insert DNA into a polypeptide. Many molecules of the polypeptide
encoded by the insert DNA can thus be rapidly synthesized.

The terms "transformed host cell," "transformed" and "transformation" refer to
the introduction of DNA into a cell. The cell is termed a "host cell", and it may be a
prokaryotic or a eukaryotic cell. Typical prokaryotic host cells include various strains
of E. coli. Typical eukaryotic host cells are plant cells, such as maize cells, yeast
cells, insect cells or animal cells. The introduced DNA is usually in the form of a
vector containing an inserted piece of DNA. The introduced DNA sequence may be
from the same species as the host cell or from a different species from the host cell, or
it may be a hybrid DNA sequence, containing some foreign DNA and some DNA
derived from the host species.

In accordance with the present invention, a cDNA (SEQ ID NO:1) encoding
(E)-β-farnesene synthase (SEQ ID NO:2) from peppermint (Mentha piperita) was
isolated and sequenced in the following manner. An enriched cDNA library was
constructed from peppermint secretory cell clusters consisting of the eight glandular
cells subtending the oil droplet. These cell clusters were harvested by leaf surface
abrasion and the RNA contained therein was isolated. mRNA was purified by
oligo-dT cellulose chromatography, and 5 μg of mRNA was used to construct a λZAPII
cDNA library.

Plasmids were excised from the library en masse and used to transform E. coli
strain XLOLR. Approximately 150 individual plasmid-bearing strains were grown in
5 ml LB media overnight, and the corresponding plasmids were purified before partial
5'-sequencing. Putative terpenoid synthase genes were identified by sequence
comparison using the BLAST program of the GCG Wisconsin Package ver. 8.
Bluescript plasmids harboring unique full-length cDNA inserts with high similarity to
known plant terpenoid synthases were tested for functional expression following
transformation into E. coli XL1-Blue cells. A single extract, from the bacteria
containing clone p43, including the cDNA insert set forth in SEQ ID NO:1, produced
a sesquiterpene olefin from [1-³H]FDP, and this clone was selected for further study.

A cell-free extract of E. coli XL1-Blue cells harboring the plasmid p43,
including the cDNA insert set forth in SEQ ID NO:1, was prepared and shown to be
capable of catalyzing the divalent metal ion-dependent conversion of [1-³H]FDP to
labeled sesquiterpene olefins. Control reactions, employing extracts of XL1-Blue
cells transformed with pBluescript lacking the insert, evidenced no detectable
production of sesquiterpene olefins from [1-³H]FDP, thereby demonstrating that a
cDNA clone (SEQ ID NO:1) encoding (E)-β-farnesene synthase (SEQ ID NO:2) had
been acquired.

The recombinant (E)-β-farnesene synthase (SEQ ID NO:2) was inactive with
the C20 substrate analog [1-³H]GGDP, but was able to catalyze the divalent
cation-dependent conversion of the C10 analog [1-³H]GDP to monoterpene olefins. Control
reactions, employing extracts of XL1-Blue cells transformed with pBluescript lacking
the insert, evidenced no detectable production of monoterpene olefins from
[1-³H]GDP, thereby confirming that the monoterpene synthase activity expressed
from the cDNA insert of p43 (SEQ ID NO:1) was a function of the (E)-β-farnesene
synthase (SEQ ID NO:2). This is the first report describing the utilization of GDP by
a sesquiterpene synthase.

Complete sequencing of the (E)-β-farnesene synthase cDNA (SEQ ID NO:1)
contained in p43 revealed an insert size of 1959 bp encoding an open reading frame of
550 amino acids with a deduced molecular weight of 63,829. The deduced amino
acid sequence of the (E)-β-farnesene synthase (SEQ ID NO:2) lacks a plastidial
targeting peptide. Like all other known terpenoid synthases, (E)-β-farnesene synthase
is rich in tryptophan (1.8%) and alanine (5.5%) residues, and bears a DDXXD motif
(SEQ ID NO:3) (residues 301-305 of SEQ ID NO:2) which is believed to coordinate
the divalent metal ion chelated to the substrate diphosphate group. The enzyme has a
deduced isoelectric point at pH 5.16.
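
The sequence features described above (a DDXXD motif, residue composition, and an open reading frame that fits within the insert) are the kind of properties that can be checked programmatically once a deduced protein sequence is in hand. The following is a minimal sketch; the function names are hypothetical, and the short peptide in the usage note is a made-up toy string, not SEQ ID NO:2.

```python
import re

# DDXXD: two aspartates, any two residues, then aspartate.
# The lookahead lets overlapping occurrences be reported as well.
DDXXD_PATTERN = re.compile(r"(?=(DD..D))")


def find_ddxxd(protein):
    """Return 1-based (start, end) residue positions of every DDXXD motif."""
    return [(m.start() + 1, m.start() + 5)
            for m in DDXXD_PATTERN.finditer(protein.upper())]


def residue_percent(protein, residue):
    """Percentage of positions in the protein occupied by the given residue."""
    if not protein:
        raise ValueError("empty sequence")
    return 100.0 * protein.upper().count(residue.upper()) / len(protein)


def orf_length_bp(n_residues, include_stop=True):
    """Nucleotides needed to encode n_residues, optionally with a stop codon."""
    return 3 * n_residues + (3 if include_stop else 0)
```

For the toy peptide "MAWDDALDKA", find_ddxxd reports one motif at residues 4-8. Applied to the figures in the text, a 550-residue open reading frame needs 1653 bp including a stop codon, which leaves room for untranslated sequence within the 1959 bp insert.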

The isolation of a cDNA (SEQ ID NO:1) encoding (E)-β-farnesene synthase
(SEQ ID NO:2) permits the development of efficient expression systems for this
functional enzyme; provides useful tools for examining the developmental regulation
of (E)-β-farnesene synthase; permits investigation of the reaction mechanism(s) of this
enzyme, and permits the isolation of other (E)-β-farnesene synthases. The isolation of
an (E)-β-farnesene synthase cDNA (SEQ ID NO:1) also permits the transformation of
a wide range of organisms in order to enhance, enable or otherwise alter, the synthesis
of (E)-β-farnesene.

Although the (E)-β-farnesene synthase protein set forth in SEQ ID NO:2 lacks
a plastidial targeting sequence, a targeting sequence from another protein can be
included in the (E)-β-farnesene synthase amino terminus. Transport sequences well
known in the art (See, for example, the following publications, the cited portions of
which are incorporated by reference herein: von Heijne et al., Eur. J. Biochem.,
180:535-545, 1989; Stryer, Biochemistry, W.H. Freeman and Company, New York,
NY, p. 769 [1988]) may be employed to direct (E)-β-farnesene synthase to other
cellular or extracellular locations.

In addition to the native (E)-β-farnesene synthase amino acid sequence of
SEQ ID NO:2, sequence variants produced by deletions, substitutions, mutations
and/or insertions are intended to be within the scope of the invention except insofar as
limited by the prior art. The (E)-β-farnesene synthase amino acid sequence variants of
this invention may be constructed by mutating the DNA sequences that encode the
wild-type synthases, such as by using techniques commonly referred to as
site-directed mutagenesis. Nucleic acid molecules encoding the (E)-β-farnesene synthases
of the present invention can be mutated by a variety of PCR techniques well known to
one of ordinary skill in the art. (See, for example, the following publications, the cited
portions of which are incorporated by reference herein: "PCR Strategies", M.A.
Innis, D.H. Gelfand and J.J. Sninsky, eds., 1995, Academic Press, San Diego, CA
(Chapter 14); "PCR Protocols: A Guide to Methods and Applications", M.A. Innis,
D.H. Gelfand, J.J. Sninsky and T.J. White, eds., Academic Press, NY (1990)).

By way of non-limiting example, the two primer system utilized in the
Transformer Site-Directed Mutagenesis kit from Clontech may be employed for
introducing site-directed mutants into the (E)-β-farnesene synthase genes of the
present invention. Following denaturation of the target plasmid in this system, two
primers are simultaneously annealed to the plasmid; one of these primers contains the
desired site-directed mutation, the other contains a mutation at another point in the
plasmid resulting in elimination of a unique restriction site. Second strand synthesis is
then carried out, tightly linking these two mutations, and the resulting plasmids are
transformed into a mutS strain of E. coli. Plasmid DNA is isolated from the
transformed bacteria, restricted with the relevant restriction enzyme (thereby
linearizing the unmutated plasmids), and then retransformed into E. coli. This system
allows for generation of mutations directly in an expression plasmid, without the
necessity of subcloning or generation of single-stranded phagemids. The tight linkage
of the two mutations and the subsequent linearization of unmutated plasmids results in
high mutation efficiency and allows minimal screening. Following synthesis of the
initial restriction site primer, this method requires the use of only one new primer type
per mutation site. Rather than prepare each positional mutant separately, a set of
"designed degenerate" oligonucleotide primers can be synthesized in order to
introduce all of the desired mutations at a given site simultaneously. Transformants
can be screened by sequencing the plasmid DNA through the mutagenized region to
identify and sort mutant clones. Each mutant DNA can then be fully sequenced or
restricted and analyzed by electrophoresis on Mutation Detection Enhancement gel
(J.T. Baker) to confirm that no other alterations in the sequence have occurred (by
band shift comparison to the unmutagenized control).
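
As an illustration of the "designed degenerate" primer idea, the following minimal
Python sketch expands a degenerate codon written in IUPAC nucleotide codes and lists
the amino acids it could introduce at one site. The function names and the example
codons are assumptions chosen for illustration; they are not part of the kit protocol.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand_codon(degenerate):
    """Return every concrete codon represented by a degenerate codon."""
    choices = (IUPAC[base] for base in degenerate.upper())
    return ["".join(p) for p in product(*choices)]


def encoded_residues(degenerate):
    """Return the sorted set of amino acids ('*' = stop) a degenerate codon encodes."""
    return sorted({CODON_TABLE[codon] for codon in expand_codon(degenerate)})


if __name__ == "__main__":
    # Hypothetical example: GAN at one site yields Asp and Glu variants together.
    print(expand_codon("GAN"), encoded_residues("GAN"))
```

A primer synthesized with such a degenerate codon at the target site yields a mixed
pool of mutants, which is why the transformants are then sorted by sequencing.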

Again, by way of non-limiting example, the two primer system utilized in the
QuikChange™ Site-Directed Mutagenesis kit from Stratagene (La Jolla, California)
may be employed for introducing site-directed mutants into the (E)-β-farnesene
synthase genes of the present invention. Double-stranded plasmid DNA, containing
the insert bearing the target mutation site, is denatured and mixed with two
oligonucleotides complementary to each of the strands of the plasmid DNA at the
target mutation site. The annealed oligonucleotide primers are extended using Pfu
DNA polymerase, thereby generating a mutated plasmid containing staggered nicks.
After temperature cycling, the unmutated, parental DNA template is digested with
restriction enzyme DpnI, which cleaves methylated or hemimethylated DNA, but
which does not cleave unmethylated DNA. The parental, template DNA is almost
always methylated or hemimethylated since most strains of E. coli, from which the
template DNA is obtained, contain the required methylase activity. The remaining,
annealed vector DNA incorporating the desired mutation(s) is transformed into
E. coli.
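
The complementary primer pair and the DpnI digestion step can be sketched in a few
lines of Python. This is a minimal illustration under stated assumptions: the helper
names are hypothetical, the example sequences are arbitrary, and the only enzyme fact
relied on is that DpnI recognizes the sequence GATC when its adenine is methylated.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def complementary_primer_pair(mutant_primer):
    """Return (forward, reverse) primers that anneal to opposite plasmid strands
    at the same site and both carry the desired mutation."""
    forward = mutant_primer.upper()
    return forward, reverse_complement(forward)


def dpni_sites(seq):
    """Return 0-based start positions of GATC, the DpnI recognition sequence.
    Only methylated (parental) copies of these sites would actually be cut."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "GATC"]


if __name__ == "__main__":
    fwd, rev = complementary_primer_pair("gattaca")  # arbitrary toy sequence
    print(fwd, rev, dpni_sites("AAGATCGGATCC"))
```

A template with no GATC site would not be removed by DpnI, so counting sites in the
parental plasmid is a quick sanity check before relying on this selection step.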

The mutated (E)-β-farnesene synthase gene can be cloned into a pET (or
other) overexpression vector that can be employed to transform E. coli, such as strain
E. coli BL21(DE3)pLysS, for high level production of the mutant protein, and
purification by standard protocols. Examples of plasmid vectors and E. coli strains
that can be used to express high levels of the (E)-β-farnesene synthase proteins of the
present invention are set forth in Sambrook et al., Molecular Cloning, A Laboratory
Manual, 2nd Edition (1989), Chapter 17. The method of FAB-MS mapping can be
employed to rapidly check the fidelity of mutant expression. This technique provides
for sequencing segments throughout the whole protein and provides the necessary
confidence in the sequence assignment. In a mapping experiment of this type, protein
is digested with a protease (the choice will depend on the specific region to be
modified since this segment is of prime interest and the remaining map should be
identical to the map of unmutagenized protein). The set of cleavage fragments is
fractionated by microbore HPLC (reversed phase or ion exchange, again depending
on the specific region to be modified) to provide several peptides in each fraction, and
the molecular weights of the peptides are determined by FAB-MS. The masses are
then compared to the molecular weights of peptides expected from the digestion of
the predicted sequence, and the correctness of the sequence quickly ascertained.
Since the exemplary mutagenesis techniques set forth herein produce site-directed
mutations, sequencing of the altered peptide should not be necessary if the mass
spectrograph agrees with prediction. If necessary to verify a changed residue,
CAD-tandem MS/MS can be employed to sequence the peptides of the mixture in
question, or the target peptide can be purified for subtractive Edman degradation or
carboxypeptidase Y digestion, depending on the location of the modification.
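
The comparison of observed masses with those predicted from the sequence can be
sketched as follows. This is a minimal illustration, not the mapping procedure itself:
it assumes trypsin as the protease (cleavage after Lys or Arg, but not before Pro),
uses standard monoisotopic residue masses, reports neutral peptide masses, and uses
short made-up sequences rather than the synthase sequence. All function names are
hypothetical.

```python
import re

# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_peptides(protein):
    """Split a protein after K or R, except when the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein.upper()) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def unmatched_peptides(wild_type, mutant, tol=0.01):
    """Mutant peptides whose mass matches no wild-type peptide within tol (Da)."""
    wt_masses = [peptide_mass(p) for p in tryptic_peptides(wild_type)]
    return [
        (p, round(peptide_mass(p), 4))
        for p in tryptic_peptides(mutant)
        if not any(abs(peptide_mass(p) - m) <= tol for m in wt_masses)
    ]


if __name__ == "__main__":
    # Toy sequences differing by a single Cys -> Ser substitution.
    print(unmatched_peptides("AAKGCR", "AAKGSR"))
```

For a single site-directed substitution, only the peptide carrying the changed
residue should appear in the unmatched list; the rest of the map should be unchanged.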

In the design of a particular site directed mutagenesis experiment, it is
generally desirable to first make a non-conservative substitution (e.g., Ala for Cys, His
or Glu) and determine if activity is greatly impaired as a consequence. The properties
of the mutagenized protein are then examined with particular attention to the kinetic
parameters of Km and kcat as sensitive indicators of altered function, from which
changes in binding and/or catalysis per se may be deduced by comparison to the
native enzyme. If the residue is by this means demonstrated to be important by
activity impairment, or knockout, then conservative substitutions can be made, such
as Asp for Glu to alter side chain length, Ser for Cys, or Arg for His. For
hydrophobic segments, it is largely size that is usefully altered, although aromatics can
also be substituted for alkyl side chains. Changes in the normal product distribution
can indicate which step(s) of the reaction sequence have been altered by the mutation.
Modification of the hydrophobic pocket can be employed to change binding
conformations for substrates and result in altered regiochemistry and/or
stereochemistry.
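
The use of Km and kcat to distinguish binding effects from catalytic effects can be
illustrated with a short calculation. The sketch below assumes simple
Michaelis-Menten behavior, and every numerical value in it is hypothetical; no
kinetic constants for the synthase are implied.

```python
def michaelis_menten_rate(substrate, km, kcat, enzyme_total):
    """Initial rate v = kcat * [E]t * [S] / (Km + [S])."""
    return kcat * enzyme_total * substrate / (km + substrate)


def compare_kinetics(native, mutant):
    """Return mutant/native ratios for Km, kcat and kcat/Km.

    A raised Km with unchanged kcat points to impaired substrate binding;
    a lowered kcat with unchanged Km points to impaired catalysis.
    """
    return {
        "km_ratio": mutant["km"] / native["km"],
        "kcat_ratio": mutant["kcat"] / native["kcat"],
        "efficiency_ratio": (mutant["kcat"] / mutant["km"])
        / (native["kcat"] / native["km"]),
    }


if __name__ == "__main__":
    # Hypothetical constants (Km in micromolar, kcat in 1/s).
    native = {"km": 2.0, "kcat": 10.0}
    mutant = {"km": 8.0, "kcat": 5.0}
    print(compare_kinetics(native, mutant))
```

In this made-up case the mutant retains one eighth of the native catalytic
efficiency, with contributions from both weaker binding and slower turnover.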

Other site directed mutagenesis techniques may also be employed with the
nucleotide sequences of the invention. For example, restriction endonuclease
digestion of DNA followed by ligation may be used to generate deletion variants of
(E)-β-farnesene synthase, as described in section 15.3 of Sambrook et al., Molecular
Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press,
New York, NY [1989], incorporated herein by reference. A similar strategy may be
used to construct insertion variants, as described in section 15.3 of Sambrook et al.,
supra.

Oligonucleotide-directed mutagenesis may also be employed for preparing
substitution variants of this invention. It may also be used to conveniently prepare the
deletion and insertion variants of this invention. This technique is well known in the
art as described by Adelman et al. (DNA 2:183 [1983]); Sambrook et al., supra;
"Current Protocols in Molecular Biology", 1991, Wiley (NY), F.M. Ausubel, R. Brent,
R.E. Kingston, D.D. Moore, J.G. Seidman, J.A. Smith and K. Struhl, eds.,
incorporated herein by reference.

Generally, oligonucleotides of at least 25 nucleotides in length are used to
insert, delete or substitute two or more nucleotides in the (E)-β-farnesene synthase
molecule. An optimal oligonucleotide will have 12 to 15 perfectly matched
nucleotides on either side of the nucleotides coding for the mutation. To mutagenize
wild-type (E)-β-farnesene synthase, the oligonucleotide is annealed to the
single-stranded DNA template molecule under suitable hybridization conditions. A DNA
polymerizing enzyme, usually the Klenow fragment of E. coli DNA polymerase I, is
then added. This enzyme uses the oligonucleotide as a primer to complete the
synthesis of the mutation-bearing strand of DNA. Thus, a heteroduplex molecule is
formed such that one strand of DNA encodes the wild-type synthase inserted in the
vector, and the second strand of DNA encodes the mutated form of the synthase
inserted into the same vector. This heteroduplex molecule is then transformed into a
suitable host cell.
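
The length and flank guidelines above can be expressed as a small design helper.
This is a sketch under stated assumptions: the function name, its arguments and the
toy template are hypothetical, and the oligonucleotide is returned in the sense of
the strand supplied, so its reverse complement would be needed if that same strand
is the single-stranded template.

```python
def design_mutagenic_oligo(template, start, end, replacement, flank=15):
    """Build an oligo that replaces template[start:end] with `replacement`,
    flanked on each side by `flank` perfectly matched nucleotides.

    Enforces the guidelines of 12 to 15 matched nucleotides per side and an
    overall length of at least 25 nucleotides.
    """
    if not 12 <= flank <= 15:
        raise ValueError("flank should be 12 to 15 nucleotides")
    if start < flank or end + flank > len(template):
        raise ValueError("mutation site is too close to the end of the template")
    left = template[start - flank:start]
    right = template[end:end + flank]
    oligo = (left + replacement + right).upper()
    if len(oligo) < 25:
        raise ValueError("oligo is shorter than 25 nucleotides; use longer flanks")
    return oligo


if __name__ == "__main__":
    template = "ACGT" * 10  # arbitrary 40-nt toy template
    # Substitute the three nucleotides at positions 18..20 with GCT (an Ala codon).
    print(design_mutagenic_oligo(template, 18, 21, "GCT"))
```

Passing an empty replacement models a deletion, and a replacement longer than the
replaced span models an insertion, subject to the same flank requirements.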

Mutants with more than one amino acid substituted may be generated in one
of several ways. If the amino acids are located close together in the polypeptide
chain, they may be mutated simultaneously using one oligonucleotide that codes for
all of the desired amino acid substitutions. If, however, the amino acids are located
some distance from each other (separated by more than ten amino acids, for example)
it is more difficult to generate a single oligonucleotide that encodes all of the desired
changes. Instead, one of two alternative methods may be employed. In the first
method, a separate oligonucleotide is generated for each amino acid to be substituted.
The oligonucleotides are then annealed to the single-stranded template DNA
simultaneously, and the second strand of DNA that is synthesized from the template
will encode all of the desired amino acid substitutions. An alternative method
involves two or more rounds of mutagenesis to produce the desired mutant. The first
round is as described for the single mutants: wild-type (E)-β-farnesene synthase DNA
is used for the template, an oligonucleotide encoding the first desired amino acid
substitution(s) is annealed to this template, and the heteroduplex DNA molecule is
then generated. The second round of mutagenesis utilizes the mutated DNA
produced in the first round of mutagenesis as the template. Thus, this template
already contains one or more mutations. The oligonucleotide encoding the additional
desired amino acid substitution(s) is then annealed to this template, and the resulting
strand of DNA now encodes mutations from both the first and second rounds of
mutagenesis. This resultant DNA can be used as a template in a third round of
mutagenesis, and so on.
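
The choice between one oligonucleotide and several can be sketched as a grouping
of the target residue positions. This is only an illustration: the function name is
hypothetical, and the rule used here (positions differing by ten or fewer residues
share an oligonucleotide) is one assumed reading of the "more than ten amino acids"
example above.

```python
def group_substitutions(positions, max_gap=10):
    """Group residue positions so that neighbors no more than `max_gap` apart
    share one oligonucleotide; each returned group needs its own oligo or
    its own round of mutagenesis.

    Each position is compared with the previous member of its group, so a
    chain of closely spaced sites can span more than `max_gap` overall.
    """
    groups = []
    for pos in sorted(set(positions)):
        if groups and pos - groups[-1][-1] <= max_gap:
            groups[-1].append(pos)
        else:
            groups.append([pos])
    return groups


if __name__ == "__main__":
    # Hypothetical residue numbers chosen for illustration only.
    print(group_substitutions([120, 12, 15, 40, 125]))
```

Each group then corresponds either to one of several oligonucleotides annealed
simultaneously or to one round in the sequential scheme.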

A gene encoding (E)-β-farnesene synthase may be incorporated into any
organism (intact plant, animal, microbe, etc.), or cell culture derived therefrom, that
produces substrates that can be converted to (E)-β-farnesene. An (E)-β-farnesene
synthase gene may be introduced into any organism for a variety of purposes
including, but not limited to: production of (E)-β-farnesene synthase, or its product
(E)-β-farnesene; enhancement of the rate of production and/or the absolute amount of
(E)-β-farnesene; enhancement of protection of plants against pests and pathogens, for
example by producing (E)-β-farnesene to act as a pollinator attractant synomone for
predators and parasites of plant pests, or as an aphid alarm pheromone. While the
nucleic acid molecules of the present invention can be introduced into any organism,
the nucleic acid molecules of the present invention will preferably be introduced into a
plant species.

Eukaryotic expression systems may be utilized for the production of
(E)-β-farnesene synthase since they are capable of carrying out any required
posttranslational modifications and of directing the enzyme to the proper cellular
compartment. A representative eukaryotic expression system for this purpose uses
the recombinant baculovirus, Autographa californica nuclear polyhedrosis virus
(AcNPV; M.D. Summers and G.E. Smith, A Manual of Methods for Baculovirus
Vectors and Insect Cell Culture Procedures [1986]; Luckow et al., Bio-technology,
6:47-55 [1987]) for expression of the (E)-β-farnesene synthases of the invention.
Infection of insect cells (such as cells of the species Spodoptera frugiperda) with the
recombinant baculoviruses allows for the production of large amounts of the
(E)-β-farnesene synthase proteins. In addition, the baculovirus system has other important
advantages for the production of recombinant (E)-β-farnesene synthase. For example,
baculoviruses do not infect humans and can therefore be safely handled in large
quantities. In the baculovirus system, a DNA construct is prepared including a DNA
segment encoding (E)-β-farnesene synthase and a vector. The vector may comprise
the polyhedron gene promoter region of a baculovirus, the baculovirus flanking
sequences necessary for proper cross-over during recombination (the flanking
sequences comprise about 200-300 base pairs adjacent to the promoter sequence) and
a bacterial origin of replication which permits the construct to replicate in bacteria.
The vector is constructed so that (i) the DNA segment is placed adjacent (or operably
linked or "downstream" or "under the control of") to the polyhedron gene promoter
and (ii) the promoter/(E)-β-farnesene synthase combination is flanked on both sides
by 200-300 base pairs of baculovirus DNA (the flanking sequences).

To produce the (E)-β-farnesene synthase DNA construct, a cDNA clone
encoding the full length (E)-β-farnesene synthase is obtained using methods such as
those described herein. The DNA construct is contacted in a host cell with
baculovirus DNA of an appropriate baculovirus (that is, of the same species of
baculovirus as the promoter encoded in the construct) under conditions such that
recombination is effected. The resulting recombinant baculoviruses encode the full
(E)-β-farnesene synthase. For example, an insect host cell can be cotransfected or
transfected separately with the DNA construct and a functional baculovirus.
Resulting recombinant baculoviruses can then be isolated and used to infect cells to
effect production of the (E)-β-farnesene synthase. Host insect cells include, for
example, Spodoptera frugiperda cells, that are capable of producing a
baculovirus-expressed (E)-β-farnesene synthase. Insect host cells infected with a recombinant
baculovirus of the present invention are then cultured under conditions allowing
expression of the baculovirus-encoded (E)-β-farnesene synthase. (E)-β-farnesene
synthase thus produced is then extracted from the cells using methods known in the
art.

Other eukaryotic microbes such as yeasts may also be used to practice this
invention. The baker's yeast, Saccharomyces cerevisiae, is a commonly used yeast,
although several other strains are available. The plasmid YRp7 (Stinchcomb et al.,
Nature, 282:39 [1979]; Kingsman et al., Gene 7:141 [1979]; Tschemper et al., Gene,
10:157 [1980]) is commonly used as an expression vector in Saccharomyces. This
plasmid contains the trp1 gene that provides a selection marker for a mutant strain of
yeast lacking the ability to grow in the absence of tryptophan, such as strains ATCC
No. 44,076 and PEP4-1 (Jones, Genetics, 85:12 [1977]). The presence of the trp1
lesion as a characteristic of the yeast host cell genome then provides an effective
environment for detecting transformation by growth in the absence of tryptophan.
Yeast host cells are generally transformed using the polyethylene glycol method, as
described by Hinnen (Proc. Natl. Acad. Sci. USA, 75:1929 [1978]). Additional yeast
transformation protocols are set forth in Gietz et al., N.A.R., 20(17):1425 (1992);
Reeves et al., FEMS, 99(2-3):193-197 (1992), both of which publications are
incorporated herein by reference.

Suitable promoting sequences in yeast vectors include the promoters for
3-phosphoglycerate kinase (Hitzeman et al., J. Biol. Chem., 255:2073 [1980]) or
other glycolytic enzymes (Hess et al., J. Adv. Enzyme Reg. 7:149 [1968];
Holland et al., Biochemistry, 17:4900 [1978]), such as enolase,
glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase,
glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase,
triosephosphate isomerase, phosphoglucose isomerase, and glucokinase. In the
construction of suitable expression plasmids, the termination sequences associated
with these genes are also ligated into the expression vector 3' of the sequence desired
to be expressed to provide polyadenylation of the mRNA and termination. Other
promoters that have the additional advantage of transcription controlled by growth
conditions are the promoter region for alcohol dehydrogenase 2, isocytochrome C,
acid phosphatase, degradative enzymes associated with nitrogen metabolism, and the
aforementioned glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible
for maltose and galactose utilization. Any plasmid vector containing yeast-compatible
promoter, origin of replication and termination sequences is suitable.

Cell cultures derived from multicellular organisms, such as plants, may be used
as hosts to practice this invention. Transgenic plants can be obtained, for example, by
transferring plasmids that encode (E)-β-farnesene synthase and a selectable marker
gene, e.g., the kan gene encoding resistance to kanamycin, into Agrobacterium
tumefaciens containing a helper Ti plasmid as described in Hoeckema et al., Nature,
303:179-181 [1983] and culturing the Agrobacterium cells with leaf slices, or other
tissues or cells, of the plant to be transformed as described by An et al., Plant
Physiology, 81:301-305 [1986]. Transformation of cultured plant host cells is
normally accomplished through Agrobacterium tumefaciens. Cultures of mammalian
host cells and other host cells that do not have rigid cell membrane barriers are usually
transformed using the calcium phosphate method as originally described by Graham
and Van der Eb (Virology, 52:546 [1978]) and modified as described in
sections 16.32-16.37 of Sambrook et al., supra. However, other methods for
introducing DNA into cells such as Polybrene (Kawai and Nishizawa, Mol. Cell.
Biol., 4:1172 [1984]), protoplast fusion (Schaffner, Proc. Natl. Acad. Sci. USA,
77:2163 [1980]), electroporation (Neumann et al., EMBO J., 1:841 [1982]), and
direct microinjection into nuclei (Capecchi, Cell, 22:479 [1980]) may also be used.
Additionally, animal transformation strategies are reviewed in Monastersky, G.M. and
Robl, J.M., Strategies in Transgenic Animal Science, ASM Press, Washington, D.C.,
1995, incorporated herein by reference. Transformed plant calli may be selected
through the selectable marker by growing the cells on a medium containing, e.g.,
kanamycin, and appropriate amounts of phytohormone such as naphthalene acetic
acid and benzyladenine for callus and shoot induction. The plant cells may then be
regenerated and the resulting plants transferred to soil using techniques well known to
those skilled in the art.

In addition, a gene regulating (E)-β-farnesene synthase production can be
incorporated into the plant along with a necessary promoter which is inducible. In the
practice of this embodiment of the invention, a promoter that only responds to a
specific external or internal stimulus is fused to the target cDNA. Thus, the gene will
not be transcribed except in response to the specific stimulus. As long as the gene is
not being transcribed, its gene product and enzyme product are not produced.

An illustrative example of a responsive promoter system that can be used in
the practice of this invention is the glutathione-S-transferase (GST) system in maize.
GSTs are a family of enzymes that can detoxify a number of hydrophobic electrophilic
compounds that often are used as pre-emergent herbicides (Weigand et al., Plant
Molecular Biology, 7:235-243 [1986]). Studies have shown that the GSTs are
directly involved in causing this enhanced herbicide tolerance. This action is primarily
mediated through a specific 1.1 kb mRNA transcription product. In short, maize has
a naturally occurring quiescent gene already present that can respond to external
stimuli and that can be induced to produce a gene product. This gene has previously
been identified and cloned. Thus, in one embodiment of this invention, the promoter
is removed from the GST responsive gene and attached to an (E)-β-farnesene
synthase gene that previously has had its native promoter removed. This engineered
gene is the combination of a promoter that responds to an external chemical stimulus
and a gene responsible for successful production of (E)-β-farnesene synthase.

In addition to the methods described above, several methods are known in the
art for transferring cloned DNA into a wide variety of plant species, including
gymnosperms, angiosperms, monocots and dicots (see, e.g., Glick and
Thompson, eds., Methods in Plant Molecular Biology, CRC Press, Boca Raton,
Florida [1993], incorporated by reference herein). Representative examples include
electroporation-facilitated DNA uptake by protoplasts in which an electrical pulse
transiently permeabilizes cell membranes, permitting the uptake of a variety of
biological molecules, including recombinant DNA (Rhodes et al., Science,
240:204-207 [1988]); treatment of protoplasts with polyethylene glycol (Lyznik et al.,
Plant Molecular Biology, 13:151-161 [1989]); and bombardment of cells with
DNA-laden microprojectiles which are propelled by explosive force or compressed gas to
penetrate the cell wall (Klein et al., Plant Physiol. 91:440-444 [1989] and
Boynton et al., Science, 240:1534-1538 [1988]). Transformation of Taxus species
can be achieved, for example, by employing the methods set forth in Han et al., Plant
Science, 95:187-196 (1994), incorporated by reference herein. A method that has
been applied to Rye plants (Secale cereale) is to directly inject plasmid DNA,
including a selectable marker gene, into developing floral tillers (de la Pena et al.,
Nature 325:274-276 (1987)). Further, plant viruses can be used as vectors to transfer
genes to plant cells. Examples of plant viruses that can be used as vectors to
transform plants include the Cauliflower Mosaic Virus (Brisson et al., Nature
310:511-514 (1984)). Additionally, plant transformation strategies and techniques are
reviewed in Birch, R.G., Ann. Rev. Plant Phys. Plant Mol. Biol., 48:297 (1997);
Forester et al., Exp. Agric., 33:15-33 (1997). The aforementioned publications
disclosing plant transformation techniques are incorporated herein by reference, and
minor variations make these technologies applicable to a broad range of plant species.

Each of these techniques has advantages and disadvantages. In each of the
techniques, DNA from a plasmid is genetically engineered such that it contains not
only the gene of interest, but also selectable and screenable marker genes. A
selectable marker gene is used to select only those cells that have integrated copies of
the plasmid (the construction is such that the gene of interest and the selectable and
screenable genes are transferred as a unit). The screenable gene provides another
check for the successful culturing of only those cells carrying the genes of interest. A
commonly used selectable marker gene is neomycin phosphotransferase II (NPT II).
This gene conveys resistance to kanamycin, a compound that can be added directly to
the growth media on which the cells grow. Plant cells are normally susceptible to
kanamycin and, as a result, die. The presence of the NPT II gene overcomes the
effects of the kanamycin and each cell with this gene remains viable. Another
selectable marker gene which can be employed in the practice of this invention is the
gene which confers resistance to the herbicide glufosinate (Basta). A screenable gene
commonly used is the β-glucuronidase gene (GUS). The presence of this gene is
characterized using a histochemical reaction in which a sample of putatively
transformed cells is treated with a GUS assay solution. After an appropriate
incubation, the cells containing the GUS gene turn blue.

The plasmid containing one or more of these genes is introduced into either
plant protoplasts or callus cells by any of the previously mentioned techniques. If the
marker gene is a selectable gene, only those cells that have incorporated the DNA
package survive under selection with the appropriate phytotoxic agent. Once the
appropriate cells are identified and propagated, plants are regenerated. Progeny from
the transformed plants must be tested to ensure that the DNA package has been
successfully integrated into the plant genome.

Mammalian host cells may also be used in the practice of the invention.
Examples of suitable mammalian cell lines include monkey kidney CV1 line
transformed by SV40 (COS-7, ATCC CRL 1651); human embryonic kidney line 293S
(Graham et al., J. Gen. Virol., 36:59 [1977]); baby hamster kidney cells (BHK,
ATCC CCL 10); Chinese hamster ovary cells (Urlaub and Chasin, Proc. Natl. Acad.
Sci. USA 77:4216 [1980]); mouse Sertoli cells (TM4, Mather, Biol. Reprod., 23:243
[1980]); monkey kidney cells (CV1-76, ATCC CCL 70); African green monkey
kidney cells (VERO-76, ATCC CRL-1587); human cervical carcinoma cells (HELA,
ATCC CCL 2); canine kidney cells (MDCK, ATCC CCL 34); buffalo rat liver cells
(BRL 3A, ATCC CRL 1442); human lung cells (W138, ATCC CCL 75); human liver
cells (Hep G2, HB 8065); mouse mammary tumor cells (MMT 060562,
ATCC CCL 51); rat hepatoma cells (HTC, M1.54, Baumann et al., J. Cell Biol., 85:1
[1980]); and TRI cells (Mather et al., Annals N.Y. Acad. Sci., 383:44 [1982]).
Expression vectors for these cells ordinarily include (if necessary) DNA sequences for
an origin of replication, a promoter located in front of the gene to be expressed, a
ribosome binding site, an RNA splice site, a polyadenylation site, and a transcription
terminator site.

Promoters used in mammalian expression vectors are often of viral origin.
These viral promoters are commonly derived from polyoma virus, Adenovirus 2, and
most frequently Simian Virus 40 (SV40). The SV40 virus contains two promoters
that are termed the early and late promoters. These promoters are particularly useful
because they are both easily obtained from the virus as one DNA fragment that also
contains the viral origin of replication (Fiers et al., Nature, 273:113 [1978]). Smaller
or larger SV40 DNA fragments may also be used, provided they contain the
approximately 250-bp sequence extending from the HindIII site toward the BglI site
located in the viral origin of replication.

Alternatively, promoters that are naturally associated with the foreign gene
(homologous promoters) may be used provided that they are compatible with the host
cell line selected for transformation.

An origin of replication may be obtained from an exogenous source, such as
SV40 or other virus (e.g., Polyoma, Adeno, VSV, BPV) and inserted into the cloning
vector. Alternatively, the origin of replication may be provided by the host cell
chromosomal replication mechanism. If the vector containing the foreign gene is
integrated into the host cell chromosome, the latter is often sufficient.

The use of a secondary DNA coding sequence can enhance production levels
of (E)-β-farnesene synthase in transformed cell lines. The secondary coding sequence
typically comprises the enzyme dihydrofolate reductase (DHFR). The wild-type form
of DHFR is normally inhibited by the chemical methotrexate (MTX). The level of
DHFR expression in a cell will vary depending on the amount of MTX added to the
cultured host cells. An additional feature of DHFR that makes it particularly useful as
a secondary sequence is that it can be used as a selection marker to identify
transformed cells. Two forms of DHFR are available for use as secondary sequences,
wild-type DHFR and MTX-resistant DHFR. The type of DHFR used in a particular
host cell depends on whether the host cell is DHFR deficient (such that it either
produces very low levels of DHFR endogenously, or it does not produce functional
DHFR at all). DHFR-deficient cell lines such as the CHO cell line described by
Urlaub and Chasin, supra, are transformed with wild-type DHFR coding sequences.
After transformation, these DHFR-deficient cell lines express functional DHFR and
are capable of growing in a culture medium lacking the nutrients hypoxanthine,
glycine and thymidine. Nontransformed cells will not survive in this medium.

The MTX-resistant form of DHFR can be used as a means of selecting for
transformed host cells in those host cells that endogenously produce normal amounts
of functional DHFR that is MTX sensitive. The CHO-K1 cell line (ATCC No. CL 61)
possesses these characteristics, and is thus a useful cell line for this purpose. The
addition of MTX to the cell culture medium will permit only those cells transformed
with the DNA encoding the MTX-resistant DHFR to grow. The nontransformed cells
will be unable to survive in this medium.

Prokaryotes may also be used as host cells for the initial cloning steps of this
invention. They are particularly useful for rapid production of large amounts of DNA,
for production of single-stranded DNA templates used for site-directed mutagenesis,
for screening many mutants simultaneously, and for DNA sequencing of the mutants
generated. Suitable prokaryotic host cells include E. coli K12 strain 294 (ATCC
No. 31,446), E. coli strain W3110 (ATCC No. 27,325), E. coli X1776 (ATCC
No. 31,537), and E. coli B; however, many other strains of E. coli, such as HB101,
JM101, NM522, NM538, NM539, and many other species and genera of prokaryotes
including bacilli such as Bacillus subtilis, other enterobacteriaceae such as Salmonella
typhimurium or Serratia marcescens, and various Pseudomonas species may all be
used as hosts. Prokaryotic host cells or other host cells with rigid cell walls are
preferably transformed using the calcium chloride method as described in section 1.82
of Sambrook et al., supra. Alternatively, electroporation may be used for
transformation of these cells. Prokaryote transformation techniques are set forth in
Dower, W.J., in Genetic Engineering, Principles and Methods, 12:275-296, Plenum
Publishing Corp., 1990; Hanahan et al., Meth. Enzymol., 204:63 (1991).

As a representative example, cDNA sequences encoding (E)-β-farnesene
synthase may be transferred to the (His)6•Tag pET vector commercially available
(from Novagen) for overexpression in E. coli as heterologous host. This pET
expression plasmid has several advantages in high level heterologous expression
systems. The desired cDNA insert is ligated in frame to plasmid vector sequences
encoding six histidines followed by a highly specific protease recognition site
(thrombin) that are joined to the amino terminus codon of the target protein. The
histidine "block" of the expressed fusion protein promotes very tight binding to
immobilized metal ions and permits rapid purification of the recombinant protein by
immobilized metal ion affinity chromatography. The histidine leader sequence is then
cleaved at the specific proteolysis site by treatment of the purified protein with
thrombin, and the (E)-β-farnesene synthase again purified by immobilized metal ion
affinity chromatography, this time using a shallower imidazole gradient to elute the
recombinant synthases while leaving the histidine block still adsorbed. This
overexpression-purification system has high capacity, excellent resolving power and is
fast, and the chance of a contaminating E. coli protein exhibiting similar binding
behavior (before and after thrombin proteolysis) is extremely small.
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As vnU be apparent to those skilled in the art, any plasmid vectors containing 
repltcon and control sequences that are derived from species compatible with the host 
cell may also be used in the practice of the invention. The vector usually has a 
replication site, marker genes that provide phenotypic selection in transformed cells, 
5 one or more pronK>ters, and a polylinker region containing several restriction sites for 
msertion of foreign DNA, Plasmids typically used for transfonnation of E. coli 
mclude pBR322, pUC18, pUC19, pUCIIS, pUC119, and Bluescript M13, all of 
which are described in sections 1.12-1.20 of Sambrook et al., supra. However, many 
other suitable vectors are available as well. These vectors contain genes coding for 

10 ampicillin and/or tetracycline resistance which enables cells transformed with these 
vectors to grow in the presence of these antibiotics. 

The promoters most commonly used in prokaryotic vectors include the
β-lactamase (penicillinase) and lactose promoter systems (Chang et al., Nature,
375:615 [1978]; Itakura et al., Science, 198:1056 [1977]; Goeddel et al., Nature,
281:544 [1979]) and a tryptophan (trp) promoter system (Goeddel et al., Nucl. Acids
Res., 8:4057 [1980]; EPO Appl. Publ. No. 36,776), and the alkaline phosphatase
systems. While these are the most commonly used, other microbial promoters have
been utilized, and details concerning their nucleotide sequences have been published,
enabling a skilled worker to ligate them functionally into plasmid vectors (see
Siebenlist et al., Cell, 20:269 [1980]).

Many eukaryotic proteins normally secreted from the cell contain an
endogenous secretion signal sequence as part of the amino acid sequence. Thus,
proteins normally found in the cytoplasm can be targeted for secretion by linking a
signal sequence to the protein. This is readily accomplished by ligating DNA
encoding a signal sequence to the 5' end of the DNA encoding the protein and then
expressing this fusion protein in an appropriate host cell. The DNA encoding the
signal sequence may be obtained as a restriction fragment from any gene encoding a
protein with a signal sequence. Thus, prokaryotic, yeast, and eukaryotic signal
sequences may be used herein, depending on the type of host cell utilized to practice
the invention. The DNA and amino acid sequence encoding the signal sequence
portion of several eukaryotic genes including, for example, human growth hormone,
proinsulin, and prealbumin are known (see Stryer, Biochemistry, W.H. Freeman and
Company, New York, NY, p. 769 [1988]), and can be used as signal sequences in
appropriate eukaryotic host cells. Yeast signal sequences, as for example acid
phosphatase (Arima et al., Nuc. Acids Res., 11:1657 [1983]), α-factor, alkaline
phosphatase and invertase may be used to direct secretion from yeast host cells.
Prokaryotic signal sequences from genes encoding, for example, LamB or OmpF
(Wong et al., Gene, 68:193 [1988]), MalE, PhoA, or beta-lactamase, as well as other
genes, may be used to target proteins from prokaryotic cells into the culture medium.

Trafficking sequences from plants, animals and microbes can be employed in
the practice of the invention to direct the (E)-β-farnesene synthase proteins of the
present invention to the cytoplasm, endoplasmic reticulum, mitochondria or other
cellular components, or to target the protein for export to the medium. These
considerations apply to the overexpression of (E)-β-farnesene synthase, and to
direction of expression within cells or intact organisms to permit gene product
function in any desired location.

Suitable vectors containing DNA encoding replication sequences, regulatory
sequences, phenotypic selection genes and the (E)-β-farnesene synthase DNA of
interest are constructed using standard recombinant DNA procedures. Isolated
plasmids and DNA fragments are cleaved, tailored, and ligated together in a
specific order to generate the desired vectors, as is well known in the art (see, for
example, Sambrook et al., supra).

As discussed above, (E)-β-farnesene synthase variants are preferably produced
by means of mutation(s) that are generated using the method of site-specific
mutagenesis. This method requires the synthesis and use of specific oligonucleotides
that encode both the sequence of the desired mutation and a sufficient number of
adjacent nucleotides to allow the oligonucleotide to stably hybridize to the DNA
template.
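
The oligonucleotide design described above can be sketched in a few lines of Python. This is only an illustration: the function name, the flank length and the toy template are assumptions introduced here and are not taken from this specification or from SEQ ID NO:1.

```python
def mutagenic_oligo(template: str, position: int, new_bases: str, flank: int = 12) -> str:
    """Build an oligonucleotide that carries `new_bases` in place of the
    template bases starting at 0-based `position`, flanked on each side by
    up to `flank` unchanged template nucleotides (hypothetical helper)."""
    template = template.upper()
    new_bases = new_bases.upper()
    end = position + len(new_bases)
    if position < 0 or end > len(template):
        raise ValueError("mutation lies outside the template")
    left = template[max(0, position - flank):position]   # unchanged 5' flank
    right = template[end:end + flank]                     # unchanged 3' flank
    return left + new_bases + right


# Toy template (hypothetical sequence used only for illustration).
toy_template = "ATGGCTACAAACGGCGTCGTAATTAGTTGCTTAAGG"
oligo = mutagenic_oligo(toy_template, position=15, new_bases="A", flank=6)
print(oligo)
```

In practice the flank length would be chosen so that the oligonucleotide hybridizes stably to the template despite the mismatch; the value used here is arbitrary.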

The foregoing may be more fully understood in connection with the following
representative examples, in which "Plasmids" are designated by a lower case p
followed by an alphanumeric designation. The starting plasmids used in this invention
are either commercially available, publicly available on an unrestricted basis, or can be
constructed from such available plasmids using published procedures. In addition,
other equivalent plasmids are known in the art and will be apparent to the ordinary
artisan.

"Digestion", "cutting" or "cleaving" of DNA refers to catalytic cleavage of the
DNA with an enzyme that acts only at particular locations in the DNA. These
enzymes are called restriction endonucleases, and the site along the DNA sequence
where each enzyme cleaves is called a restriction site. The restriction enzymes used in
this invention are commercially available and are used according to the instructions
supplied by the manufacturers. (See also sections 1.60-1.61 and sections 3.38-3.39 of
Sambrook et al., supra.)

"Recovery" or "isolation" of a given fragment of DNA from a restriction
digest means separation of the resulting DNA fragment on a polyacrylamide or an
agarose gel by electrophoresis, identification of the fragment of interest by
comparison of its mobility versus that of marker DNA fragments of known molecular
weight, removal of the gel section containing the desired fragment, and separation of
the gel from DNA. This procedure is known generally. For example, see Lawn et al.
(Nucleic Acids Res., 9:6103-6114 [1982]), and Goeddel et al. (Nucleic Acids Res.,
supra).
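
As a rough illustration of how the fragment sizes expected from a restriction digest can be predicted before comparison with marker DNA on a gel, the following Python sketch cuts a linear sequence at every occurrence of a recognition site. The function name and the toy sequence are assumptions made here; EcoRI (recognition site GAATTC, cleaved after the G) is used only as a familiar example and is not an enzyme named in this specification.

```python
def digest_linear(dna: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear DNA at every occurrence
    of `site`; the cut falls `cut_offset` bases into the recognition site."""
    dna, site = dna.upper(), site.upper()
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    bounds = [0] + cuts + [len(dna)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Toy linear sequence with two EcoRI sites (hypothetical, for illustration).
toy_dna = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"
fragments = digest_linear(toy_dna, "GAATTC", cut_offset=1)
print(fragments)
```

The fragment lengths always sum to the length of the input, which is a convenient sanity check when several enzymes are combined.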

The following examples merely illustrate the best mode now contemplated for 
practicing the invention, but should not be construed to limit the invention. 

Example 1 

Essential Oil Analysis and Cell-Free Assay 

Plant Material and Reagents. Unless stated otherwise, the following plant
materials and reagents were used in the experiments reported in this and succeeding
Examples. Mentha x piperita L. cv. "Black Mitcham" was propagated from rhizomes
as previously described (Gershenzon, J., McCaskill, D., Rajaonarivony, J. I. M.,
Mihaliak, C., Karp, F. and Croteau, R. (1992) Anal. Biochem. 200, 130-138). The
preparations of [1-3H]geranyl diphosphate (GDP) (250 Ci/mol), [1-3H]farnesyl
diphosphate (FDP) (125 Ci/mol), and [1-3H]geranylgeranyl diphosphate (GGDP)
(118 Ci/mol) have been previously reported (Croteau, R., Alonso, W. R., Koepp,
A. and Johnson, M. A. (1994) Arch. Biochem. Biophys. 309, 184-192; Dixit,
V. M., Laskovics, V. M., Noall, W. I. and Poulter, C. D. (1981) J. Org. Chem. 46,
1967-1969; LaFever, R. E., Stofer Vogel, B. and Croteau, R. (1994) Arch. Biochem.
Biophys. 313, 139-149). Terpenoid standards were from our own collection or were
prepared from plant material purchased locally. α-Farnesene was a gift from Dr. J.
Brown (Washington State University), δ-cadinene was a gift from Dr. M. Essenberg
(Oklahoma State University), and commercially steam distilled peppermint oil was a
gift from I. P. Callison and Sons, Inc., Chehalis, WA. All other biochemicals and
reagents were purchased from Sigma Chemical Co. or Aldrich Chemical Co., unless
otherwise noted.
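
Because substrate specific activities are quoted in Ci/mol, measured radioactivity can be converted to an amount of labeled material. The following Python sketch shows the arithmetic; the function names and the example count are illustrative assumptions, and the calculation presumes that the product retains the tritium label at the specific activity of the substrate.

```python
DPM_PER_CURIE = 2.22e12  # 1 Ci = 3.7e10 disintegrations/s = 2.22e12 dpm


def cpm_to_dpm(cpm: float, counting_efficiency: float) -> float:
    """Correct observed counts per minute for counting efficiency (0-1)."""
    return cpm / counting_efficiency


def dpm_to_pmol(dpm: float, specific_activity_ci_per_mol: float) -> float:
    """Convert disintegrations per minute to picomoles of labeled compound."""
    moles = dpm / (specific_activity_ci_per_mol * DPM_PER_CURIE)
    return moles * 1e12


# Illustrative value: 20,000 dpm of product derived from FDP at 125 Ci/mol.
pmol_product = dpm_to_pmol(20_000, 125)
print(f"{pmol_product:.1f} pmol")
```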

Sesquiterpene Analysis. Unless stated otherwise, the following procedure
was utilized to analyze sesquiterpene content and composition in the experiments
reported in this and succeeding Examples. Young, mature peppermint leaves were
harvested and hydrodistilled from NH4HCO3-buffered water with simultaneous
pentane extraction (Maarse, H. and Kepner, R. E. (1970) J. Agr. Chem. 18, 1095-
1101). The organic phase was passed through a column of MgSO4-silica gel
(Mallinckrodt SilicAR-60) to provide the olefin fraction for GC-MS analysis.
Authentic (E)-β-farnesene was prepared by pentane extraction (followed by silica gel
fractionation) of macerated ginger (Zingiber officinale) root, black pepper oleoresin
(Piper nigrum), bergamot oil (Citrus bergamia), parsley oil (Petroselinum crispum),
or field-grown (Yakima Valley, WA) commercial peppermint oil (Lawrence, B.
(1972) Ann. Acad. Bras. Cienc. 44 (suppl.), 191-197); all of these sources are
reported to contain (E)-β-farnesene.

Instrumental Analysis. The following instrumentation was utilized in this
Example and all succeeding Examples, unless stated otherwise. Radio-GC was
performed on a Gow-Mac 550P instrument (He carrier 40 ml/min, injector 220°C,
detector 250°C and 150 mA) attached to a Packard 894 gas proportional counter.
The column (3.18 mm i.d. by 3.66 m stainless steel with 15% polyethylene glycol
ester (AT1000 Alltech) on Gas Chrom Q) was programmed from 150°C (5 min hold)
to 220°C at 5°C/min. Thermal conductivity and radioactivity outputs were monitored
after calibration with an external radiochemical standard, and ~20,000 dpm of tritiated
product was injected with data analysis using Turbochrome Navigator ver. 4.1
software (Perkin-Elmer). Liquid scintillation counting was performed in
toluene:ethanol (70:30, v/v) containing 0.4% Omnifluor (DuPont NEN) using a
Packard 460 CD spectrometer (3H efficiency ~43%). GC-MS analysis employed a
Hewlett-Packard 6890-5972 system with a 5MS capillary column (0.25 mm i.d. by
30 m with 0.25 μm coating of 5% phenyl methyl siloxane). Injections were made
cool on-column at 40°C with oven programming from 40°C (50°C/min) to 50°C (5
min hold), then 10°C/min to 250°C, then 50°C/min to 300°C. Separations were made
under a constant flow of 0.7 ml He/min. Mass spectral data were collected at 70 eV
and analyzed using Hewlett-Packard Chemstation software.
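
The length of a GC oven program follows directly from its ramp rates and holds. The Python sketch below sums the segments, assuming the GC-MS program is read as 40°C, 50°C/min to 50°C, a 5 min hold, 10°C/min to 250°C, then 50°C/min to 300°C; the function name and the tuple layout are illustrative choices, not part of the described method.

```python
def program_minutes(start_temp_c, segments):
    """Total run time of a temperature program.

    `segments` is a list of (ramp_rate_c_per_min, target_temp_c, hold_min)
    tuples applied in order, each ramp starting where the previous one ended.
    """
    total = 0.0
    temp = start_temp_c
    for rate, target, hold in segments:
        if rate <= 0:
            raise ValueError("ramp rate must be positive")
        total += abs(target - temp) / rate + hold
        temp = target
    return total


# GC-MS program as interpreted from the text (an assumption of this sketch).
gcms_program = [(50, 50, 5), (10, 250, 0), (50, 300, 0)]
gcms_minutes = program_minutes(40, gcms_program)
print(f"GC-MS oven program: {gcms_minutes:.1f} min")
```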

Cell-Free Assays. Peppermint oil gland secretory cells were isolated from
immature leaves as previously described (Gershenzon, J., McCaskill, D.,
Rajaonarivony, J. I. M., Mihaliak, C., Karp, F. and Croteau, R. (1992) Anal.
Biochem. 200, 130-138, incorporated herein by reference) and sonically disrupted
(Braun-Sonic 2000 microprobe at maximum power for three 30-second bursts with
30-second chilling period at 0-4°C between bursts) into assay buffer consisting of
25 mM Mopso (pH 7.0), 10 mM sodium ascorbate, 25 mM KCl, 10 mM DTT and
10% glycerol, and supplemented with 0.5% (w/v) PVPP and 1% (w/v) Amberlite
XAD-4 polystyrene resin. The sonicate was centrifuged at 3700 x g for 15 minutes,
and an aliquot of the supernatant was then placed in a 10 ml screw-capped glass test
tube containing divalent metal ions (10 mM MgCl2 and 1 mM MnCl2) and substrate
(7.3 μM [1-3H]FDP). The aqueous layer was overlaid with 1 ml pentane and the
sealed tube was incubated at 30°C for two hours. The pentane overlay was then
collected and the aqueous layer was extracted twice (1 ml) with pentane. The
combined pentane extracts were passed through an anhydrous MgSO4-silica gel
column to obtain the labeled hydrocarbon fraction for GC-MS analysis, or for radio-
GC analysis using peppermint oil as an internal standard.

Essential Oil Analysis. To assess the probable abundance of (E)-β-farnesene
synthase in peppermint gland secretory cells, the exclusive site of essential oil
biosynthesis (Gershenzon, J., McCaskill, D., Rajaonarivony, J. I. M., Mihaliak, C.,
Karp, F. and Croteau, R. (1992) Anal. Biochem. 200, 130-138), the sesquiterpene
olefin fraction of field-distilled peppermint oil was analyzed by GC-MS and shown to
contain β-caryophyllene (39%), γ-cadinene (33%), β-bourbonene (11%), (E)-β-
farnesene (2.9%), δ-cadinene (2.0%), germacrene D (1.3%), copaene (1.3%) and
α-humulene (1.2%) (FIGURE 1), as well as several other minor components (<1%
each). GC-MS analysis of the oil distilled from greenhouse material revealed a similar
composition, except that the amount of γ-cadinene was higher (53%), β-bourbonene
was conspicuously absent, and the (E)-β-farnesene content was 3.4%. Although (E)-
β-farnesene was not one of the more prominent sesquiterpenes of peppermint, the
abundance was sufficient to suggest that cloning of the corresponding synthase by
random sequencing of an enriched, oil gland cDNA library might be possible.

Cell-free extracts. To gain a preliminary assessment of the target activity,
cell-free extracts of peppermint oil gland secretory cells (Gershenzon, J., McCaskill,
D., Rajaonarivony, J. I. M., Mihaliak, C., Karp, F. and Croteau, R. (1992) Anal.
Biochem. 200, 130-138) were assayed for the divalent metal ion-dependent
conversion of [1-3H]farnesyl diphosphate to sesquiterpene olefins (Cane, D. E. (1990)
Chem. Rev. 90, 1089-1103). Radio-GC analysis of the derived biosynthetic products
(FIGURE 2) revealed the presence of two major components identified as
caryophyllene and γ-cadinene. However, the separation of the labeled olefins was
insufficient to resolve (E)-β-farnesene from caryophyllene, or δ-cadinene from γ-
cadinene. Both of these minor components appear at the trailing edges of the major
peaks but are, nevertheless, coincident with the authentic standards, indicating the
corresponding biosynthetic capability. No β-bourbonene was synthesized from FDP
by this system.

Example 2 

Cloning and Expression in E. coli of a cDNA Encoding (E)-β-Farnesene
Synthase (SEQ ID NO:1)

Library Construction and Clone Identification. Initial cloning of full-
length terpenoid biosynthetic genes from the peppermint oil gland cDNA library was
successful and established a very high degree of enrichment for these target
sequences. For example, the monoterpene cyclase, limonene synthase (Colby, S. M.,
Alonso, W. R., Katahira, E. J., McGarvey, D. J. and Croteau, R. (1993) J. Biol.
Chem. 268, 23016-23024), represents approximately 4% of the library. This fact,
plus the availability of automated sequencing capability, led to the possibility of
randomly sequencing the library in search of cDNA species encoding other terpenoid
synthases, including the (E)-β-farnesene synthase which was shown to be operational
in this plant by both sesquiterpene analysis and cell-free assay.

An enriched cDNA library was constructed from peppermint secretory cell
clusters consisting of the eight glandular cells subtending the oil droplet. These cell
clusters were harvested by a leaf surface abrasion technique (Gershenzon, J.,
McCaskill, D., Rajaonarivony, J. I. M., Mihaliak, C., Karp, F. and Croteau, R. (1992)
Anal. Biochem. 200, 130-138), and the RNA contained therein was isolated using the
protocol of Logemann et al. (Logemann, J., Schell, J. and Willmitzer, L. (1987) Anal.
Biochem. 163, 16-20). mRNA was purified by oligo-dT cellulose chromatography
(Pharmacia), and 5 μg of mRNA was used to construct a λZAPII cDNA library
according to the manufacturer's instructions (Stratagene).

Plasmids were excised from the library en masse and used to transform E. coli
strain XLOLR as per the manufacturer's instructions (Stratagene). Approximately
150 individual plasmid-bearing strains were grown in 5 ml LB media overnight, and
the corresponding plasmids were purified using a Qiawell 8 Ultra Plasmid Kit (Qiagen)
before partial 5'-sequencing by the Dye-Deoxy™ method using an ABI Sequenator at
the Laboratory for Biotechnology and Bioanalysis at Washington State University.
Putative terpenoid synthase genes were identified by sequence comparison using the
BLAST program of the GCG Wisconsin Package ver. 8. Bluescript plasmids
harboring unique full-length cDNA inserts with high similarity to known plant
terpenoid synthases were tested for functional expression following transformation
into E. coli XL1-Blue cells. A single extract, from the bacteria containing clone p43,
including the cDNA insert sequence set forth in SEQ ID NO:1, produced a
sesquiterpene olefin from [1-3H]FDP, and this clone was selected for further study.

Bacterial Expression and Characterization of (E)-β-Farnesene Synthase
(SEQ ID NO:2). E. coli XL1-Blue harboring p43 (including the cDNA insert
sequence set forth in SEQ ID NO:1), or empty pBluescript plasmid as a control, were
grown overnight at 37°C in LB medium containing 100 μg ampicillin/ml. A 50 μl
aliquot of the overnight culture was used to inoculate 5 ml of fresh LB medium, and
the culture was grown at 37°C with vigorous agitation to A600 = 0.5 before induction
with 1 mM IPTG. After an additional two hours of growth, the suspension was
centrifuged (1000 x g, 15 min, 4°C), the media removed, and the pelleted cells
resuspended in 1 ml of cold assay buffer containing 1 mM EDTA. The cells were
disrupted by sonication with a microprobe as previously described, except that only
two 20-second bursts were employed. The chilled sonicate was cleared by
centrifugation and the supernatant was assayed for sesquiterpene synthase activity as
before, or for monoterpene synthase activity (with 4.5 μM [1-3H]GDP) or diterpene
synthase activity (with 10 μM [1-3H]GGDP). In all cases, the pentane-soluble
reaction products were purified by MgSO4-silica gel chromatography, as above, to
prepare the olefin fraction for further analysis.

A cell-free extract of E. coli XL1-Blue cells harboring the plasmid p43
(including the cDNA insert sequence set forth in SEQ ID NO:1) was prepared and
shown to be capable of catalyzing the divalent metal ion-dependent conversion of
[1-3H]FDP to labeled sesquiterpene olefins. Radio-GC analysis (data not shown) and
GC-MS analysis (FIGURE 3) of this sesquiterpene olefin fraction demonstrated that
the major biosynthetic product (85%) was (E)-β-farnesene by matching of both
retention time and mass spectrum to those of the authentic standard obtained from
several natural sources. Lesser amounts of (Z)-β-farnesene (8%) and δ-cadinene
(5%), as well as three other minor products (less than 1% each; all seemingly of the
cadinene-type based on MS), were also produced. Control reactions, employing
extracts of XL1-Blue cells transformed with pBluescript lacking the cDNA insert
having the sequence set forth in SEQ ID NO:1, evidenced no detectable production of
sesquiterpene olefins from [1-3H]FDP, thereby demonstrating that a cDNA clone
encoding (E)-β-farnesene synthase had been acquired.

Multiple product formation is a common feature of the terpenoid synthases,
and may be a consequence of the electrophilic reaction mechanism catalyzed by these
enzymes in which highly reactive carbocationic intermediates are generated (Cane,
D. E. (1990) Chem. Rev. 90, 1089-1103; Croteau, R. (1987) Chem. Rev. 87, 929-
954). (E)-β-farnesene is one of the simplest sesquiterpene olefins that can be derived
from FDP, in a reaction involving divalent metal ion-assisted ionization of the
diphosphate ester and deprotonation from the C-3 methyl of the resulting carbocation
(FIGURE 4). The formation of δ-cadinene (FIGURE 4) involves a considerably more
extended reaction sequence, in which a preliminary isomerization step (to nerolidyl
diphosphate) is required to permit the ionization-dependent cyclization to the
macrocycle, followed by 1,3-hydride shift, closure of the second ring, and
deprotonation to the bicyclic product. The small amount of δ-cadinene produced by
the recombinant synthase (SEQ ID NO:2) from FDP is interesting in light of the
abundance of this bicyclic olefin in the sesquiterpene fraction of peppermint oil and
the efficient production of this olefin in oil gland extracts; these observations suggest
that an additional and distinct δ-cadinene synthase must operate in peppermint.

The recombinant (E)-β-farnesene synthase (SEQ ID NO:2) was inactive with
the C20 substrate analog [1-3H]GGDP, but was able to catalyze the divalent cation-
dependent conversion of the C10 analog [1-3H]GDP to monoterpene olefins.
Although the rate of conversion of GDP to these products was less than 3% of the
rate of conversion of FDP to sesquiterpene olefins at saturation, a more diverse
spectrum of products was formed (see FIGURE 5 for structures). The cyclic
monoterpenes limonene (48%) and terpinolene (15%), and the acyclic monoterpene
analog of β-farnesene, myrcene (15%), were the most abundant products as
determined by both radio-GC and GC-MS analysis (data not shown). Lesser amounts
of γ-terpinene (7%), (Z)-ocimene (6%), (E)-ocimene (7%), and sabinene (3%) were
also observed as products. Control reactions, employing extracts of XL1-Blue cells
transformed with pBluescript lacking the insert, evidenced no detectable production
of monoterpene olefins from [1-3H]GDP, thereby confirming that the monoterpene
synthase activity expressed from p43 was a function of the (E)-β-farnesene synthase
(SEQ ID NO:2). This is the first report describing the utilization of GDP by a
sesquiterpene synthase. Because monoterpene biosynthesis is localized to plastids, as
is diterpene biosynthesis, whereas sesquiterpene biosynthesis occurs in the cytoplasm
(Chappell, J. (1995) Annu. Rev. Plant Physiol. Plant Mol. Biol. 46, 521-547), the
utilization of GDP as a substrate by (E)-β-farnesene synthase is unlikely to be of
physiological relevance and may simply reflect the lack of evolutionary pressure to
discern the chain length of this isoprenoid substrate to which the enzyme is not
exposed in vivo.

Example 3

Sequence Analysis of the p43 cDNA Insert (SEQ ID NO:1)

Complete sequencing of the (E)-β-farnesene synthase cDNA (SEQ ID NO:1)
contained in p43 revealed an insert size of 1959 bp encoding an open reading frame of
550 amino acids with a deduced molecular weight of 63,829. A putative starting
methionine codon was identified which was out of frame with the vector β-
galactosidase starting methionine; however, a fortuitous stop codon in the 5'-
untranslated region, 46 bp upstream of the synthase translation start site and in frame
with the β-galactosidase fusion sequence, allowed polycistronic translation of the
cDNA free of vector-derived sequence. The deduced amino acid sequence of the (E)-
β-farnesene synthase (SEQ ID NO:2) lacks a plastidial targeting peptide (Keegstra,
K., Olsen, J. J. and Theg, S. M. (1989) Annu. Rev. Plant Physiol. Plant Mol. Biol. 40,
471-501), typical of monoterpene and diterpene synthases (Colby, S. M., Alonso,
W. R., Katahira, E. J., McGarvey, D. J. and Croteau, R. (1993) J. Biol. Chem. 268,
23016-23024; Stofer Vogel, B., Wildung, M. R., Vogel, G. and Croteau, R. (1996) J.
Biol. Chem. 271, 23262-23268; Wildung, M. R. and Croteau, R. (1996) J. Biol.
Chem. 271, 9201-9204), but consistent with all known plant-derived sesquiterpene
synthases (Fachinni, P. J. and Chappell, J. (1992) Proc. Natl. Acad. Sci. USA 89,
11088-11092; Back, K. and Chappell, J. (1995) J. Biol. Chem. 270, 7375-7381;
Chen, X. Y., Chen, Y., Heinstein, P. and Davisson, V. J. (1996) Arch. Biochem.
Biophys. 324, 255-266) which are directed to the cytoplasm. Like all other known
terpenoid synthases, (E)-β-farnesene synthase (SEQ ID NO:2) is rich in tryptophan
(1.8%) and arginine (5.5%) residues, and bears a DDXXD motif (residues 301-
305) (SEQ ID NO:3) which is believed to coordinate the divalent metal ion chelated to
the substrate diphosphate group (Marrero, O. F., Poulter, C. D. and Edwards, P. A.
(1992) J. Biol. Chem. 267, 21873-21878); the enzyme (SEQ ID NO:2) has a deduced
isoelectric point at pH 5.16.
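
The residue composition and motif position reported above are the kind of values that can be read off a deduced protein sequence programmatically. The Python sketch below is a minimal illustration on a made-up peptide; the function names and the toy sequence are assumptions, and SEQ ID NO:2 itself is not reproduced here.

```python
import re


def residue_percent(protein: str, residue: str) -> float:
    """Percentage of positions in `protein` occupied by one-letter `residue`."""
    protein = protein.upper()
    return 100.0 * protein.count(residue.upper()) / len(protein)


def find_ddxxd(protein: str) -> list[tuple[int, int]]:
    """1-based (start, end) positions of every DDXXD motif, overlaps included."""
    return [(m.start() + 1, m.start() + 5)
            for m in re.finditer(r"(?=DD..D)", protein.upper())]


# Hypothetical 16-residue peptide used only to exercise the functions.
toy_protein = "MAWRLSDDIYDVYGTW"
print(find_ddxxd(toy_protein))
print(residue_percent(toy_protein, "W"), residue_percent(toy_protein, "R"))
```

Applied to a full-length sequence, the same two functions would give the tryptophan and arginine percentages and the residue numbers of the aspartate-rich motif.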

The deduced amino acid sequence of the farnesene synthase (SEQ ID NO:2) is
most similar to that of the sesquiterpene cyclase epi-aristolochene synthase from
tobacco (Fachinni, P. J. and Chappell, J. (1992) Proc. Natl. Acad. Sci. USA 89,
11088-11092) in exhibiting 62% similarity (S) and 49% identity (I). This peppermint
synthase (SEQ ID NO:2) also closely resembles the three other known angiosperm
sesquiterpene cyclases (vetispiradiene synthase from Hyoscyamus muticus (Back, K.
and Chappell, J. (1995) J. Biol. Chem. 270, 7375-7381) at 63% S and 40% I, δ-
cadinene synthase from cotton (Chen, X. Y., Chen, Y., Heinstein, P. and Davisson,
V. J. (1996) Arch. Biochem. Biophys. 324, 255-266) at 60% S and 37% I, and
germacrene C synthase from tomato at 57% S and 34% I (unpublished)), and also the
diterpene cyclase, casbene synthase (Mau, C. J. D. and West, C. A. (1994) Proc.
Natl. Acad. Sci. USA 91, 8497-8501), from castor bean (at 61% S and 35% I). Since
(E)-β-farnesene synthase (SEQ ID NO:2) produces a small amount of δ-cadinene, but
cannot be the major source of δ-cadinene in peppermint, it is tempting to speculate
that the farnesene synthase (SEQ ID NO:2) represents either a progenitor, or an
altered form of cadinene synthase in which the ability to catalyze the more complex
bicyclization reaction has been lost.

Surprisingly, (E)-β-farnesene synthase (SEQ ID NO:2) is no more closely
related to monoterpene synthases from the Lamiaceae (limonene synthase from
spearmint (Colby, S. M., Alonso, W. R., Katahira, E. J., McGarvey, D. J. and Croteau,
R. (1993) J. Biol. Chem. 268, 23016-23024) with 51% S and 30% I; sabinene
synthase and 1,8-cineole synthase from culinary sage with 50% S and 29% I each)
than to the various terpenoid synthases from the gymnosperm Abies grandis
(monoterpene synthases with 49% S and 28% I (Bohlmann, J., Steele, C. L. and
Croteau, R. (1997) J. Biol. Chem. 272, 21784-21792); sesquiterpene synthases with
53% S and 29% I; diterpene synthases with 51% S and 28% I (Stofer Vogel, B.,
Wildung, M. R., Vogel, G. and Croteau, R. (1996) J. Biol. Chem. 271, 23262-
23268)). Even a phylogenetically distant diterpene cyclase from Taxus brevifolia,
taxadiene synthase (Wildung, M. R. and Croteau, R. (1996) J. Biol. Chem. 271,
9201-9204), resembles (E)-β-farnesene synthase (SEQ ID NO:2) at the amino acid
level (50% S and 24% I) as closely as do the monoterpene synthases of the mint
family. These sequence-based relationships may reflect a bifurcation in the evolution
of the monoterpene synthases from the other terpenoid synthases that is as ancient as
the separation between the angiosperms and gymnosperms.
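
Percent identity (I) and percent similarity (S) of the kind quoted above are derived from a pairwise alignment. The Python sketch below shows one simple way such figures can be counted once two sequences are already aligned. It is not the procedure used to obtain the values in this Example: the residue grouping, the exclusion of gapped columns and the toy alignment are all assumptions made for illustration.

```python
# Illustrative grouping of chemically similar residues (an assumption;
# alignment packages normally use a scoring matrix instead).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def identity_similarity(a: str, b: str) -> tuple[float, float]:
    """Percent identity and similarity over ungapped columns of two
    pre-aligned, equal-length sequences ('-' marks a gap)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # gapped columns are skipped in this sketch
        compared += 1
        if x == y:
            identical += 1
            similar += 1
        elif any(x in g and y in g for g in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Toy alignment (hypothetical fragments, not real synthase sequences).
pct_i, pct_s = identity_similarity("MKV-LDDE", "MRVALDNE")
print(f"{pct_i:.1f}% I, {pct_s:.1f}% S")
```

Similarity is always at least as large as identity under this counting, which matches the pattern of the S and I values reported for each comparison.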

Example 4 

Characterization of (E)-β-Farnesene Synthase (SEQ ID NO:2)

For determination of the pH optimum of (E)-β-farnesene synthase (SEQ ID
NO:2), the preparation was adjusted with 50 mM Mopso (to a pH of 6.5, 6.75, 7.0,
7.25, 7.5, 8.0, or 8.5) before the assay. Kinetic constants for FDP, GDP, Mg++ and
Mn++ were determined using a preparation of (E)-β-farnesene synthase (SEQ ID
NO:2) that was partially purified by anion-exchange chromatography (on a Mono-Q
column (Pharmacia) equilibrated with assay buffer and eluted with a linear KCl
gradient (0 to 500 mM) in assay buffer). The 210-230 mM fraction containing the
(E)-β-farnesene synthase (SEQ ID NO:2) was used for kinetic evaluation of FDP and
GDP as substrates (concentration range 0.31 to 20 μM, with saturating Mg++). Due
to the tenacious binding of divalent cations by the synthase, the partially purified
enzyme (prepared in the presence of 10 mM EDTA) was dialyzed overnight against
assay buffer containing 50 mM EDTA. The dialysate was buffer-exchanged by
ultrafiltration (Amicon Centriprep 30, 450 fold dilution), then desalted (Bio-Rad
Econo-Pak 10DG) into assay buffer. Kinetic constants for Mg++ and Mn++ (assay
range 1 μM to 2 mM of the chloride salts) were then determined at 7.3 μM
[1-3H]FDP. Triplicate assays were conducted and control incubations (without
enzyme) were included in all cases. A double reciprocal plot (Lineweaver, H. and
Burk, D. (1934) J. Am. Chem. Soc. 56, 658-666) was generated for each averaged
data set, and the equation of the best-fit line determined (Kaleidagraph ver. 3.08,
Synergy Software).
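
The double reciprocal analysis can be sketched as an ordinary least-squares fit of 1/v against 1/[S], from which Km and Vmax follow from the slope and intercept. The Python below is a minimal illustration, not the software procedure used in this Example; the function names are assumptions, and the data are synthetic, noise-free values generated from the Michaelis-Menten equation purely to show that the fit recovers the input constants.

```python
def lineweaver_burk(substrate, velocity):
    """Fit 1/v = (Km/Vmax)(1/[S]) + 1/Vmax by least squares.

    Returns (Km, Vmax) in the units of the inputs."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in velocity]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # Km / Vmax
    intercept = mean_y - slope * mean_x  # 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


def michaelis_menten(s, km, vmax):
    """Initial velocity predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


# Synthetic data (not measurements): a two-fold dilution series in uM.
substrate_um = [0.31, 0.625, 1.25, 2.5, 5.0, 10.0, 20.0]
velocity = [michaelis_menten(s, km=0.6, vmax=100.0) for s in substrate_um]
km_fit, vmax_fit = lineweaver_burk(substrate_um, velocity)
print(f"Km = {km_fit:.2f} uM, Vmax = {vmax_fit:.1f}")
```

With real data the reciprocal transform weights the lowest substrate concentrations most heavily, so replicate averaging before fitting, as described above, helps stabilize the estimate.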

The recombinant, partially purified (E)-β-farnesene synthase (SEQ ID NO:2)
exhibited a broad pH optimum in the 6.75 to 7.25 range in Mopso buffer. This
observation is in agreement with the studies of Salin et al. (Salin, F., Pauly, G.,
Charon, J. and Gleizes, M. (1995) J. Plant Phys. 146, 203-209) in which the purified
(E)-β-farnesene synthase from maritime pine was shown to possess a pH optimum in
the 7.0 to 7.3 range. The Km value for FDP with the recombinant synthase (SEQ ID
NO:2) was calculated to be 0.6 μM, a value typical of other sesquiterpene synthases
of plant origin (Cane, D. E. (1990) Chem. Rev. 90, 1089-1103) but lower than the
value of 5 μM determined for the enzyme from maritime pine (Salin, F., Pauly, G.,
Charon, J. and Gleizes, M. (1995) J. Plant Phys. 146, 203-209). Substrate
concentrations in excess of 10 μM FDP evidenced slight inhibition of activity with the
recombinant enzyme (SEQ ID NO:2). Although the relative velocity at saturating
levels of GDP was only 3% of the velocity with FDP for the recombinant synthase
(SEQ ID NO:2), the calculated Km value for GDP (1.5 μM) was only about three-fold
higher than that for FDP, suggesting that the binding of the C10 analog was
reasonably efficient.
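
One conventional way to combine a relative velocity and a Km into a single comparison of two substrates is the ratio of their V/Km values. The Python sketch below applies that ratio to the figures given above; the comparison itself is not made in this specification, the function name is an assumption, and the "only 3%" relative velocity is taken as exactly 3 for the arithmetic.

```python
def relative_specificity(v_rel_alt, km_alt, v_rel_ref, km_ref):
    """Ratio of (V/Km) for an alternative substrate to (V/Km) for the
    reference substrate; velocities may be in any common relative unit."""
    return (v_rel_alt / km_alt) / (v_rel_ref / km_ref)


# GDP (relative velocity 3, Km 1.5 uM) compared with FDP (100, Km 0.6 uM).
gdp_vs_fdp = relative_specificity(3.0, 1.5, 100.0, 0.6)
print(f"GDP is used at about {100 * gdp_vs_fdp:.1f}% of the V/Km for FDP")
```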

A Km value of 150 μM was determined for Mg++ (Vrel = 100), and a Km value
of 7.0 μM was determined for Mn++ (Vrel = 80). No inhibition of activity was
observed at Mg++ concentrations up to 10 mM; however, concentrations of Mn++
exceeding 20 μM resulted in a sharp decline in reaction velocity to a plateau (Vrel =
20) in the 0.25 to 2 mM range. Since the product distribution of the recombinant (E)-
β-farnesene synthase (SEQ ID NO:2) had been initially determined in the presence of
excess Mg++, the conversion of [1-3H]FDP was re-evaluated in the presence of Mn++
alone at apparent saturation (20 μM). The olefin products were again analyzed by
GC-MS and found in this case to consist of 98% (E)-β-farnesene and 2% (Z)-β-
farnesene. No δ-cadinene, or other sesquiterpenes, were synthesized in this instance,
indicating that a structural alteration in the binding of Mn++ to the substrate and/or
enzyme (relative to Mg++) improves the fidelity of the reaction.

In operational characteristics (pH optimum, kinetic constants) and physical
features (size, pI), the recombinant (E)-β-farnesene synthase (SEQ ID NO:2) is a
typical sesquiterpene synthase (Cane, D. E. (1990) Chem. Rev. 90, 1089-1103;
Facchini, P. J. and Chappell, J. (1992) Proc. Natl. Acad. Sci. USA 89, 11088-11092;
Back, K. and Chappell, J. (1995) J. Biol. Chem. 270, 7375-7381; Chen, X. Y., Chen,
Y., Heinstein, P. and Davisson, V. J. (1996) Arch. Biochem. Biophys. 324, 255-266),
suggesting that the enzyme should be highly functional in planta. Given that this
synthase (SEQ ID NO:2) will be targeted by default to the cytoplasm (Chappell, J.
(1995) Annu. Rev. Plant Physiol. Plant Mol. Biol. 46, 521-547; Keegstra, K., Olsen,
L. J. and Theg, S. M. (1989) Annu. Rev. Plant Physiol. Plant Mol. Biol. 40, 471-501),
where the substrate arises from the mevalonate pathway, it should be possible to
engineer virtually any plant for the production of (E)-β-farnesene in order to exploit
the kairomonal and pheromonal properties of this natural product.
Example 5

Properties of (E)-β-Farnesene Synthase Proteins of the Present Invention

The (E)-β-farnesene synthase proteins of the present invention all require a
divalent metal ion as a cofactor. Most (E)-β-farnesene synthase proteins of the
present invention utilize either Mg²⁺ or Mn²⁺ as a cofactor. Nonetheless, (E)-β-
farnesene synthase proteins of the present invention are inhibited at concentrations of
Mn²⁺ in excess of about 5 mM.

(E)-β-Farnesene synthase proteins of the present invention have a pH optimum
in the range of from about pH 5.5 to about pH 8.5, and a pI in the range of from
about pH 4.5 to about pH 6.0. The Km(FPP) of (E)-β-farnesene synthase proteins of
the present invention is less than about 10 µM, while the Kcat(FPP) of (E)-β-farnesene
synthase proteins of the present invention is less than about 5/sec. The (E)-β-
farnesene synthase proteins of the present invention exist as either monomers or
homodimers, with the monomer having a molecular weight of from about 55 kD
(kiloDaltons) to about 65 kD.

Example 6

Hybridization of Peppermint (E)-β-Farnesene Synthase cDNA (SEQ ID NO:1)
to Other Nucleic Acid Sequences of the Present Invention

The nucleic acid molecules of the present invention are capable of hybridizing
to the nucleic acid sequence set forth in SEQ ID NO:1, or to the complementary
sequence of the nucleic acid sequence set forth in SEQ ID NO:1, under the following
stringent hybridization conditions: incubation in 5 X SSC at 65°C for 16 hours,
followed by washing under the following conditions: two washes in 2 X SSC at 18°C
to 25°C for twenty minutes per wash; preferably, two washes in 2 X SSC at 18°C to
25°C for twenty minutes per wash, followed by one wash in 0.5 X SSC at 55°C for
thirty minutes; most preferably, two washes in 2 X SSC at 18°C to 25°C for fifteen
minutes per wash, followed by two washes in 0.2 X SSC at 65°C for twenty minutes
per wash.

The ability of the nucleic acid molecules of the present invention to hybridize
to the nucleic acid sequence set forth in SEQ ID NO:1, or to the complementary
sequence of the nucleic acid sequence set forth in SEQ ID NO:1, can be determined
utilizing the technique of hybridizing radiolabelled nucleic acid probes to nucleic acids
immobilized on nitrocellulose filters or nylon membranes as set forth, for example, at
pages 9.52 to 9.55 of Molecular Cloning, A Laboratory Manual (2nd edition), J.
Sambrook, E. F. Fritsch and T. Maniatis eds., the cited pages of which are
incorporated herein by reference.

In addition to the nucleic acid sequence set forth in SEQ ID NO:1, examples
of representative nucleic acid sequences of the present invention that encode a
peppermint (E)-β-farnesene synthase protein and which hybridize to the
complementary sequence of the nucleic acid sequence disclosed in SEQ ID NO:1 are
set forth in SEQ ID NO:4; SEQ ID NO:6; SEQ ID NO:8; SEQ ID NO:10; SEQ ID
NO:12; SEQ ID NO:14; SEQ ID NO:16 and SEQ ID NO:18. With the exception of
the nucleic acid sequence set forth in SEQ ID NO:1, the foregoing representative
nucleic acid sequences of the present invention were generated using a computer. By
utilizing the degeneracy of the genetic code, each of the foregoing, representative
nucleic acid sequences has a different sequence, but each encodes the protein set forth
in SEQ ID NO:2. Thus, the identical (E)-β-farnesene synthase protein sequence is set
forth in SEQ ID NO:2; SEQ ID NO:5; SEQ ID NO:7; SEQ ID NO:9; SEQ ID
NO:11; SEQ ID NO:13; SEQ ID NO:15; SEQ ID NO:17 and SEQ ID NO:19.
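
The effect of codon degeneracy can be illustrated with a short translation routine. The following is a minimal sketch, not the program used to generate the representative sequences (which is not disclosed); the names CODON_TABLE and translate are illustrative assumptions. The two inputs are the first eight codons of the coding regions of SEQ ID NO:1 and SEQ ID NO:4, which differ only in the second codon (gct versus gca).

```python
# Standard genetic code, built from the conventional t/c/a/g ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) into one-letter amino acids."""
    dna = dna.replace(" ", "").lower()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First eight codons of the coding regions in the sequence listing.
SEQ1_START = "atg gct aca aac ggc gtc gta att"  # SEQ ID NO:1
SEQ4_START = "atg gca aca aac ggc gtc gta att"  # SEQ ID NO:4

print(translate(SEQ1_START))
print(translate(SEQ4_START))
```

Both calls yield the same peptide (Met Ala Thr Asn Gly Val Val Ile) even though the nucleotide strings differ, which is the property relied upon above.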



WO 99/18118    PCT/US98/20885



In addition to the protein sequence set forth in SEQ ID NO:2, examples of
representative (E)-β-farnesene synthase proteins of the present invention are set forth
in SEQ ID NO:20; SEQ ID NO:21; SEQ ID NO:22; SEQ ID NO:23; SEQ ID
NO:24; SEQ ID NO:25; SEQ ID NO:26; SEQ ID NO:27 and SEQ ID NO:28. With
the exception of the amino acid sequence set forth in SEQ ID NO:2, the foregoing
representative amino acid sequences of the present invention were generated using a
computer by making conservative amino acid substitutions.
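
A conservative substitution replaces a residue with another of similar physicochemical character. The following is a minimal sketch of how such a variant could be generated and checked. The grouping in CONSERVATIVE_GROUPS is one commonly used illustrative classification and is an assumption; the specification does not state which grouping or program was used. The example peptide is the first twenty residues of SEQ ID NO:2 in one-letter code.

```python
# Illustrative (assumed) groups of residues treated as interchangeable.
CONSERVATIVE_GROUPS = [
    set("GA"), set("VLIM"), set("ST"), set("DE"),
    set("NQ"), set("KRH"), set("FYW"),
]


def is_conservative(old: str, new: str) -> bool:
    """True if old and new differ but fall in the same assumed group."""
    return old != new and any(old in g and new in g for g in CONSERVATIVE_GROUPS)


def substitute(seq: str, position: int, new: str) -> str:
    """Return seq with the residue at 1-based position replaced by new."""
    old = seq[position - 1]
    if not is_conservative(old, new):
        raise ValueError(f"{old}{position}{new} is not conservative here")
    return seq[:position - 1] + new + seq[position:]


# Residues 1-20 of SEQ ID NO:2.
N_TERMINUS = "MATNGVVISCLREVRPPMTK"

# Hypothetical variant: threonine 19 replaced by serine.
variant = substitute(N_TERMINUS, 19, "S")
print(variant)
```

The check in substitute rejects non-conservative changes (for example threonine to lysine) under the assumed grouping, so only variants of the kind described above are produced.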

While the preferred embodiment of the invention has been illustrated and
described, it will be appreciated that various changes can be made therein without
departing from the spirit and scope of the invention.

The embodiments of the invention in which an exclusive property or privilege
is claimed are defined as follows:

1. An isolated nucleic acid molecule encoding an (E)-β-farnesene
synthase protein.

2. An isolated nucleic acid molecule of Claim 1 encoding an angiosperm
(E)-β-farnesene synthase protein.

3. An isolated nucleic acid molecule of Claim 1 encoding a gymnosperm
(E)-β-farnesene synthase protein.

4. An isolated nucleic acid molecule of Claim 1 encoding an essential oil
plant species (E)-β-farnesene synthase protein.

5. An isolated nucleic acid molecule of Claim 1 encoding an (E)-β-
farnesene synthase protein from the genus Mentha.

6. An isolated nucleic acid molecule of Claim 5 encoding an (E)-β-
farnesene synthase protein from Mentha piperita.

7. An isolated nucleic acid molecule of Claim 6 consisting of the nucleic
acid sequence set forth in SEQ ID NO:1.

8. An isolated nucleic acid molecule of Claim 1 encoding an (E)-β-
farnesene synthase protein having the amino acid sequence set forth in SEQ ID NO:2.

9. An isolated (E)-β-farnesene synthase protein, provided that said isolated
(E)-β-farnesene synthase protein is not native to Maritime pine.

10. A gymnosperm (E)-β-farnesene synthase protein of Claim 9.

11. An angiosperm (E)-β-farnesene synthase protein of Claim 9.

12. An essential oil plant (E)-β-farnesene synthase protein of Claim 9.

13. A Mentha (E)-β-farnesene synthase protein of Claim 9.

14. A Mentha piperita (E)-β-farnesene synthase protein of Claim 13.

15. An (E)-β-farnesene synthase protein of Claim 13, said protein
consisting of the amino acid sequence set forth in SEQ ID NO:2.

16. A replicable expression vector comprising a nucleic acid sequence
encoding an (E)-β-farnesene synthase protein.

17. A replicable expression vector of Claim 16 comprising a nucleic acid
sequence encoding an angiosperm (E)-β-farnesene synthase protein.

18. A replicable expression vector of Claim 16 comprising a nucleic acid
sequence encoding a gymnosperm (E)-β-farnesene synthase protein.

19. A replicable expression vector of Claim 16 comprising a nucleic acid
sequence encoding an essential oil plant (E)-β-farnesene synthase protein.

20. A replicable expression vector of Claim 16 comprising a nucleic acid
sequence encoding a Mentha (E)-β-farnesene synthase protein.

21. A replicable expression vector of Claim 16 comprising a nucleic acid
sequence encoding a Mentha piperita (E)-β-farnesene synthase protein.

22. A replicable expression vector of Claim 16 comprising a nucleic acid
sequence consisting of the nucleic acid sequence set forth in SEQ ID NO:1.

23. A host cell comprising a vector of Claim 16.

24. A host cell comprising a vector of Claim 17.

25. A host cell comprising a vector of Claim 18.

26. A host cell comprising a vector of Claim 19.

27. A host cell comprising a vector of Claim 20.

28. A host cell comprising a vector of Claim 21.

29. A host cell comprising a vector of Claim 22.

30. A host cell of Claim 23, said host cell being a plant cell.

31. An isolated nucleic acid molecule that is capable of hybridizing to the
nucleic acid molecule set forth in SEQ ID NO:1, or to the complementary sequence of
the nucleic acid molecule set forth in SEQ ID NO:1, under stringent hybridization
conditions.

SEQUENCE LISTING

<110> Croteau, Rodney B
      Wildung, Mark R
      Crock, John E

<120> Isolation and Bacterial Expression of a Sesquiterpene
      Synthase cDNA Clone from Peppermint (Mentha x
      piperita, L.) that Produces the Aphid Alarm Pheromone
      (E)-beta-Farnesene

<130> wsurl2882

<140>
<141>

<150> 60/061,144
<151> 1997-10-06

<160> 28

<170> PatentIn Ver. 2.0

<210> 1
<211> 1959
<212> DNA
<213> Mentha piperita

<220>
<221> CDS
<222> (71)..(1720)

<400> 1

aaactctgca atttcatata taacatcata aaatcagaga gagagacaga gagtttgttg  60

tagtgaaaaa atg gct aca aac ggc gtc gta att agt tgc tta agg gaa  109
           Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu
           1               5                   10

gta agg cca cct atg acg aag cat gcg cca agc atg tgg act gat acc  157
Val Arg Pro Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr
    15                  20                  25

ttt tct aac ttt tct ctt gac gat aag gaa caa caa aag tgc tca gaa  205
Phe Ser Asn Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu
30                  35                  40                  45

acc atc gaa gca ctt aag caa gaa gca aga ggc atg ctt atg gct gca  253
Thr Ile Glu Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala
                50                  55                  60

acc act cct ctc caa caa atg aca cta atc gac act ctc gag cgt ttg  301
Thr Thr Pro Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu
            65                  70                  75



gga ttg tct ttc cat ttt gag acg gag atc gaa tac aaa atc gaa cta  349
Gly Leu Ser Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu
        80                  85                  90

atc aac gct gca gaa gac gac ggc ttt gat ttg ttc gct act gct ctt  397
Ile Asn Ala Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu
    95                  100                 105

cgt ttc cgt ttg ctc aga caa cat caa cgc cac gtt tct tgt gat gtt  445
Arg Phe Arg Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val
110                 115                 120                 125

ttc gac aag ttc atc gac aaa gat ggc aag ttc gaa gaa tcc ctt agc  493
Phe Asp Lys Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser
                130                 135                 140

aat aat gtt gaa ggc cta tta agc ttg tat gaa gca gct cat gtt ggg  541
Asn Asn Val Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly
            145                 150                 155

ttt cgc gaa gaa aga ata tta caa gag gct gta aat ttt acg agg cat  589
Phe Arg Glu Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His
        160                 165                 170

cac ttg gaa gga gca gag tta gat cag tct cca tta ttg att aga gag  637
His Leu Glu Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu
    175                 180                 185

aaa gtg aag cga gct ttg gag cac cct ctt cat agg gat ttc ccc att  685
Lys Val Lys Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile
190                 195                 200                 205

gtc tat gca cgc ctt ttc atc tcc att tac gaa aag gat gac tct aga  733
Val Tyr Ala Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg
                210                 215                 220

gat gaa tta ctt ctc aag cta tcc aaa gtc aac ttc aaa ttc atg cag  781
Asp Glu Leu Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln
            225                 230                 235

aat ttg tat aag gaa gag ctc tcc caa ctc tcc agg tgg tgg aac aca  829
Asn Leu Tyr Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr
        240                 245                 250

tgg aat ctg aaa tca aaa tta cca tat gca aga gat cga gtc gtg gag  877
Trp Asn Leu Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu
    255                 260                 265



gct tat gtt tgg gga gta ggt tac cat tac gaa ccc caa tac tca tat  925
Ala Tyr Val Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr
270                 275                 280                 285

gtt cga atg gga ctt gcc aaa ggc gta cta att tgt gga atc atg gac  973
Val Arg Met Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp
                290                 295                 300

gat aca tat gat aat tat gct aca ctc aat gaa gct caa ctt ttt act  1021
Asp Thr Tyr Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr
            305                 310                 315

caa gtc tta gac aag tgg gat aga gat gaa gct gaa cga ctc cca gaa  1069
Gln Val Leu Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu
        320                 325                 330

tac atg aaa atc gtt tat cga ttt att ttg agt ata tat gaa aat tat  1117
Tyr Met Lys Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr
    335                 340                 345

gaa cgt gat gca gcg aaa ctt gga aaa agc ttt gca gct cct tat ttt  1165
Glu Arg Asp Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe
350                 355                 360                 365

aag gaa acc gtg aaa caa ctg gca agg gca ttt aat gag gag cag aag  1213
Lys Glu Thr Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys
                370                 375                 380

tgg gtt atg gaa agg cag cta ccg tca ttc caa gac tac gta aag aat  1261
Trp Val Met Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn
            385                 390                 395

tca gag aaa acc agc tgc att tat acc atg ttt gct tct atc atc cca  1309
Ser Glu Lys Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro
        400                 405                 410

ggc ttg aaa tct gtt acc caa gaa acc att gat tgg atc aag agt gaa  1357
Gly Leu Lys Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu
    415                 420                 425

ccc acg ctc gca aca tcg acc gct atg atc ggt cgg tat tgg aat gac  1405
Pro Thr Leu Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp
430                 435                 440                 445

acc agc tct cag ctc cgt gaa agc aaa gga ggg gaa atg ctg act gcg  1453
Thr Ser Ser Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala
                450                 455                 460

ttg gat ttc cac atg aaa gaa tat ggt ctg acg aag gaa gag gcg gca  1501
Leu Asp Phe His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala
            465                 470                 475

tct aag ttt gaa gga ttg gtt gag gaa aca tgg aag gat ata aac aag  1549
Ser Lys Phe Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys
        480                 485                 490

gaa ttc ata gcc aca act aat tat aat gtg ggt aga gaa att gcc atc  1597
Glu Phe Ile Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile
    495                 500                 505

aca ttc ctc aac tac gct cgg ata tgt gaa gcc agt tac agc aaa act  1645
Thr Phe Leu Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr
510                 515                 520                 525

gac gga gac gct tat tca gat cct aat gtt gcc aag gca aat gtc gtt  1693
Asp Gly Asp Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val
                530                 535                 540

gct ctc ttt gtt gat gcc ata gtc ttt tgatttgcat aatcaaagac  1740
Ala Leu Phe Val Asp Ala Ile Val Phe
            545                 550

cctataatta taattatatg tgtttaagaa actaataagc ttgctttatg tatagttgtc  1800
aattgaataa taatgtatta attagtagag ttaagaagtt ataaagaata aagaggagct  1860
ggtagacgta aacaagaaat aatgtgtcaa aataacttca actttttcaa gaataaagaa  1920
ttggaagaga ccaatatata caaaaaaaaa aaaaaaaaa  1959



<210> 2
<211> 550
<212> PRT
<213> Mentha piperita

<400> 2
Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1               5                   10                  15

Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn
            20                  25                  30

Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
        35                  40                  45

Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
    50                  55                  60

Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65                  70                  75                  80

Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
                85                  90                  95

Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg
            100                 105                 110

Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Asp Lys
        115                 120                 125

Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
    130                 135                 140

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu
145                 150                 155                 160

Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His His Leu Glu
                165                 170                 175

Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
            180                 185                 190

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
        195                 200                 205

Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
    210                 215                 220

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225                 230                 235                 240

Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
                245                 250                 255

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val
            260                 265                 270

Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
        275                 280                 285

Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
    290                 295                 300

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305                 310                 315                 320

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys
                325                 330                 335

Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
            340                 345                 350

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr
        355                 360                 365

Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
    370                 375                 380

Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385                 390                 395                 400

Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
                405                 410                 415

Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
            420                 425                 430

Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
        435                 440                 445

Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
    450                 455                 460

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe
465                 470                 475                 480

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
                485                 490                 495

Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
            500                 505                 510

Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
        515                 520                 525

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe
    530                 535                 540

Val Asp Ala Ile Val Phe
545                 550



<210> 3
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: conserved
      amino acid motif

<220>
<221> DOMAIN
<222> (1)..(5)
<223> Conserved domain that may coordinate binding of
      divalent metal ion

<400> 3
Asp Asp Xaa Xaa Asp
1               5
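
The conserved motif of SEQ ID NO:3 (Asp Asp Xaa Xaa Asp, where Xaa is any residue) can be located in a protein sequence with a simple pattern search. The following is a minimal sketch; the names DDXXD and find_motif are illustrative assumptions. The example fragment is residues 297 to 308 of SEQ ID NO:2 in one-letter code.

```python
import re

# "D" is aspartate; "." stands for Xaa (any residue).
DDXXD = re.compile(r"DD..D")


def find_motif(seq: str, first_residue: int = 1):
    """Return (residue number, matched text) for each non-overlapping hit."""
    return [(m.start() + first_residue, m.group()) for m in DDXXD.finditer(seq)]


# Residues 297-308 of SEQ ID NO:2: Cys Gly Ile Met Asp Asp Thr Tyr Asp Asn Tyr Ala
FRAGMENT = "CGIMDDTYDNYA"

hits = find_motif(FRAGMENT, first_residue=297)
print(hits)
```

Under this numbering the motif in the fragment begins at residue 301 (Asp Asp Thr Tyr Asp).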



<210> 4
<211> 1650
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: nucleic acid
      sequence encoding peppermint E-beta-farnesene
      synthase protein

<220>
<221> CDS
<222> (1)..(1650)
<223> Computer-generated nucleic acid sequence encoding
      peppermint E-beta-farnesene synthase protein

<400> 4
atg gca aca aac ggc gtc gta att agt tgc tta agg gaa gta agg cca  48
Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1               5                   10                  15

cct atg acg aag cat gcg cca agc atg tgg act gat acc ttt tct aac  96
Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn
            20                  25                  30

ttt tct ctt gac gat aag gaa caa caa aag tgc tea gaa acc ate gaa 144 
Phe Ser Leu Asp Asp Lys Glu Gin Gin Lys Cys Ser Glu Thr lie Glu 
35 40 45 

gca ctt aag caa gaa gca aga ggc atg ctt atg get gca acc act cct 192 
Ala Leu Lys Gin Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro 
50 55 60 

etc caa caa atg aca eta ate gac act etc gag cgt ttg gga ttg tct 240 
Leu Gin Gin Met Thr Leu lie Asp Thr Leu Glu Arg Leu Gly Leu Ser 
« 70 75 80 

ttc cat ttt gag acg gag ate gaa tac aaa ate gaa eta ate aac get 288 
Phe His Phe Glu Thr Glu lie Glu Tyr Lys lie Glu Leu lie Asn Ala 
85 90 95 

gca gaa gac gac ggc ttt gat ttg ttc get act get ctt cgt ttc cgt 336 
Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg 
100 105 110 

ttg etc aga caa eat caa cgc cac gtt tct tgt gat gtt ttc gac aag 384 
Leu Leu Arg Gin His Gin Arg His Val Ser Cys Asp Vai Phe Asp Lys 
115 120 125 

ttc ate gac aaa gat ggc aag ttc gaa gaa tec ctt age aat aat gtt 432 
Phe lie Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val 
130 135 140 

gaa ggc eta tta age ttg tat gaa gca get cat gtt ggg ttt cgc gaa 480 
Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu 
i45 150 155 160 

gaa aga ata tta caa gag get gta aat ttt acg agg cat cac ttg gaa 528 
Glu Arg lie Leu Gin Glu Ala Val Asn Phe Thr Arg His His Leu Glu 
165 170 175 

gga gca gag tta gat cag tct eea tta ttg att aga gag aaa gtg aag 576 
Gly Ala Glu Leu Asp Gin Ser Pro Leu Leu lie Arg Glu Lys Val Lys 
180 .185 190 

cga gct ttg gag cac cct ctt cat agg gat ttc ccc att gtc tat gca  624
Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
        195                 200                 205

cgc ctt ttc atc tcc att tac gaa aag gat gac tct aga gat gaa tta  672
Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
    210                 215                 220

ctt etc aag eta tec aaa gtc aac ttc aaa ttc atg cag aat ttg tat 720 
Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gin Asn Leu Tyr 
225 230 235 240 

aag gaa gag etc tec caa etc tec agg tgg tgg aac aca tgg aat ctg 768 
Lys Glu Glu Leu Ser Gin Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu 
24S 250 255 

aaa tea aaa tta eca tat gca aga gat cga gtc gtg gag get tat gtt 816 
Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val 
260 265 270 

^99 gta ggt tac cat tac gaa ccc caa tac tea tat gtt cga atg 864 

Trp Gly Val Gly Tyr His Tyr Glu Pro Gin Tyr Ser Tyr Val Arg Met 
275 280 285 

gga ctt gcc aaa ggc gta eta att tgt gga ate atg gac gat aca tat 912 
Gly Leu Ala Lys Gly Val Leu lie Cys Gly lie Met Asp Asp Thr Tyr 
290 295 300 

gat aat tat get aca etc aat gaa get caa ctt ttt act caa gtc tta 960 
Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gin Leu Phe Thr Gin Val Leu 
305 310 315 320 

gac aag tgg gat aga gat gaa get gaa cga etc cca gaa tac atg aaa 1008 
Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys 
325 330 335 

ate gtt tat cga ttt att ttg agt ata tat gaa aat tat gaa cgt gat 1056 
lie Val Tyr Arg Phe He Leu Ser He Tyr Glu Asn Tyr Glu Arg Asp 
340 345 350 

gca gcg aaa ctt gga aaa age ttt gca get cct tat ttt aag gaa acc 1104 
Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr 
355 360 365 

gtg aaa caa ctg gca agg gca ttt aat gag gag cag aag tgg gtt atg 1152 
Val Lys Gin Leu Ala Arg Ala Phe Asn Glu Glu Gin Lys Trp Val Met 
370 375 380 

gaa agg cag cta ccg tca ttc caa gac tac gta aag aat tca gag aaa  1200
Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385                 390                 395                 400

acc agc tgc att tat acc atg ttt gct tct atc atc cca ggc ttg aaa  1248
Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
                405                 410                 415

tct gtt acc caa gaa acc att gat tgg ate aag agt gaa ccc acg etc 1296 
Ser Val Thr Gin Glu Thr He Asp Trp He hys Ser du Pro Thr Leu 
420 425 430 

gca aca teg acc get atg ate ggt egg tat tgg aat gac acc age tet 1344 
Ala Thr Ser Thr Ala Met He Gly Arg Tyr Trp Asn Asp Thr ser Ser 
435 440 445 

cag etc cgt gaa age aaa gga ggg gaa atg ctg act gcg ttg gat ttc 1392 
Gin I.eu Arg Slu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe 

455 460 

cac atg aaa gaa tat ggt ctg acg aag gaa gag gcg gca tct aag ttt 1440 
His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe 

470 475 480 

gaa gga ttg gtt gag gaa aca tgg aag gat ata aac aag gaa ttc ata 1488 
Glu Gly Leu Val Glu Glu Thr Trp Lys Asp He Asn Lys Glu Phe He 
485 490 495 

gee aca act aat tat aat gtg ggt aga gaa att gcc ate aca ttc etc 1536 
Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu He Ala He Thr Phe Leu 
500 505 510 

aac tac get egg ata tgt gaa gcc agt tac age aaa act gac gga gac 1584 
Asn Tyr Ala Arg He Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp 
515 520 525 

get tat tea gat cet aat gtt gcc aag gca aat gtc gtt get etc ttt 1632 
Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe 
530 535 540 

gtt gat gcc ata gtc ttt ,^ca 
Val Asp Ala He Val Phe 
545 550 



<210> 5
<211> 550
<212> PRT
<213> Artificial Sequence

<400> 5
Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1               5                   10                  15

Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn 
20 25 30 

Phe Ser Leu Asp Asp Lys Glu Gin Gin Lys Cys Ser Glu Thr He Glu 
35 40 45 

Ala Leu Lys Gin Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro 
SO 55 60 

Leu Gin Gin Met Thr Leu He Asp Thr Leu Glu Arg Leu Gly Leu Ser 
" 70 75 80 

Phe His Phe Glu Thr Glu He Glu Tyr Lys He Glu Leu He Asn Ala 
85 90 95 

Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg 
100 105 110 

Leu Leu Arg Gin His Gin Arg His Val Ser Cys Asp Val Phe Asp Lys 
115 120 125 

Phe He Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val 
130 135 140 

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu 
l^S 150 155 160 

Glu Arg He Leu Gin Glu Ala Val Asn Phe Thr Arg His His Leu Glu 
165 170 175 

Gly Ala Glu Leu Asp Gin Ser Pro Leu Leu He Arg Glu Lys Val Lys 
180 185 190 

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro He Val Tyr Ala 
195 200 205 

Arg Leu Phe He Ser He Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu 
210 215 220 

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gin Asn Leu Tyr 
225 230 235 240 

Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
                245                 250                 255

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val
            260                 265                 270

Trp Gly Val Gly Tyr His Tyr Glu Pro Gin Tyr Ser Tyr Val Arg Met 
275 280 285 

Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
    290                 295                 300

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305                 310                 315                 320

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys
                325                 330                 335

He Val Tyr Arg Phe He Leu Ser He Tyr Glu Asn Tyr Glu Arg Asp 
340 345 350 

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr 
355 360 365 

Val Lys Gin Leu Ala Arg Ala Phe Asn Glu Glu Gin Lys Trp Val Met 
370 375 3Q0 

Glu Arg Gin Leu Pro Ser Phe Gin Asp Tyr Val Lys Asn Ser Glu Lys 

390 395 400 

Thr ser Cys He Tyr Thr Met Phe Ala Ser He He Pro Gly Leu Lys 
405 410 415 

Ser Val Thr Gin Glu Thr He Asp Trp He Lys Ser Glu Pro Thr Leu 
*20 425 430 

Ala Thr Ser Thr Ala Met He Gly Arg Tyr Trp Asn Asp Thr Ser Ser 
435 440 445 

Gin Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe 
4S0 45S 460 

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe 

475 480 

Slu Gly Leu val Glu Glu Thr Trp Lys Asp He Asn Lys Glu Phe He 
485 490 495 

Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
            500                 505                 510

Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
        515                 520                 525

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe
    530                 535                 540

Val Asp Ala Ile Val Phe
545                 550



<210> 6
<211> 1650
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: nucleic acid
      sequence encoding E-beta-farnesene synthase
      protein

<220>
<221> CDS
<222> (1)..(1650)
<223> Computer-generated nucleic acid sequence encoding
      peppermint E-beta-farnesene synthase protein

<400> 6
atg gct aca aac ggc gtc gtc att agt tgc tta agg gaa gta agg cca  48
Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1               5                   10                  15

cct atg teg aag cat gcg cca age atg tgg act gat acc ttt tct aac 96 
Pro Met ser Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn 
20 25 30 

ttt tct ctt gac gat aag gaa caa caa aag tgc tea gaa acc ate gaa 144 
Phe Ser Leu Asp Asp Lys Glu Gin Gin Lys cys Ser Glu Tlir lie Glu 
35 40 45 

gca ctt aag caa gaa gca aga ggc atg ctt atg get gca acc act cct 192 
Ala Leu Lys Gin Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro 
SO 55 60 

etc caa caa atg aca eta ate gac act etc gag cgt ttg gga ttg tct 240 
Leu Gin Gin Met Thr Leu lie Asp Thr Leu Glu Arg Leu Gly Leu Ser 
65 70 75 80 

ttc cat ttt gag acg gag atc gaa tac aaa atc gaa cta atc aac gct  288
Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
                85                  90                  95

gca gaa gac gac ggc ttt gat ttg ttc gct act gct ctt cgt ttc cgt  336
Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg
            100                 105                 110

ttg ctc aga caa cat caa cgc cac gtt tct tgt gat gtt ttc gac aag  384
Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Asp Lys
        115                 120                 125

ttc atc gac aaa gat ggc aag ttc gaa gaa tcc ctt agc aat aat gtt  432
Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
    130                 135                 140

gaa ggc cta tta agc ttg tat gaa gca gct cat gtt ggg ttt cgc gaa  480
Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu
145                 150                 155                 160

gaa aga ata tta caa gag get gta aat ttt acg agg cat cac ttg gaa 528 
Glu Arg He Leu Gin Glu Ala Val Asn Phe Thr Arg His His Leu Glu 
165 170 175 

gga gca gag tta gat cag tct cca tta ttg att aga gag aaa gtg aag 576 
Gly Ala Glu Leu Asp Gin Ser Pro Leu Leu He Ara Glu Lys Val Lys 

185 ' 190 

cga get ttg gag cac cct ctt cat agg gat ttc cce att gtc tat gca 624 
Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro He Val Tyr Ala 
135 200 205 

cgc ctt ttc ate tec att tac gaa aag gat gac tct aga gat gaa tta 672 
Arg Leu Phe He Ser lie Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu 
210 215 220 

ctt etc aag eta tec aaa gtc aac ttc aaa ttc atg cag aat ttg tat 720 
Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gin Asn Leu Tyr 

230 235 240 

aag gaa gag etc tec caa etc too agg tgg tgg aac aca tgg aat ctg 768 
Lys Glu Glu Leu Ser Gin Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu 
245 250 255 

aaa tea aaa tta cca tat gca aga gat cga gtc gtg gag get tat gtt 816 
Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val 
260 265 270 

tgg gga gta ggt tac cat tac gaa ccc caa tac tca tat gtt cga atg  864
Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
        275                 280                 285

gga ctt gcc aaa ggc gta cta att tgt gga atc atg gac gat aca tat  912
Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
    290                 295                 300

gat aat tat gct aca ctc aat gaa gct caa ctt ttt act caa gtc tta  960
Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305                 310                 315                 320



gac aag tgg gat aga gat gaa get gaa cga cte cca gaa tac atg aaa 1008 
Asp Lys Trp Asp Arg Asp Glu JUa Glu Arg Leu Pro Glu Tyr Met Lys 
325 330 335 

ate gtt tat cga ttt att ttg agt ata tat gaa aat tat gaa cgt gat 1036 
He Val Tyr Arg Phe He Leu Ser He Tyr Glu Asn Tyr Glu Arg Asp 
340 345 350 

gca gcg aaa ctt gga aaa age ttt gca get cct tat ttt aag gaa aec 1104 
Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr 
355 360 365 



gtg aaa caa ctg gca agg gca ttt aat gag gag cag aag tgg gtt atg  1152
Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
    370                 375                 380

gaa agg cag cta ccg tca ttc caa gac tac gta aag aat tca gag aaa  1200
Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385                 390                 395                 400



acc age tgc att tat acc atg ttt get tct ate ate cca ggc ttg aaa 1248 
Thr ser Cys He Tyr Thr Met Phe Ala Ser He He Pro Gly Leu Lys 
405 410 415 

tct gtt acc caa gaa acc att gat tgg ate aag agt gaa ccc acg etc 1296 
Ser Val Thr Gin Glu Thr He Asp Trp He Lys Ser Glu Pro Thr Leu 
420 425 430 



gca aca tcg acc gct atg atc ggt cgg tat tgg aat gac acc agc tct  1344
Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
        435                 440                 445

cag ctc cgt gaa agc aaa gga ggg gaa atg ctg act gcg ttg gat ttc  1392
Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
    450                 455                 460

cac atg aaa gaa tat ggt ctg acg aag gaa gag gcg gca tct aag ttt  1440
His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe
465                 470                 475                 480

gaa gga ttg gtn gag gaa aca tgg aag gat ata aac aag gaa ttc ata 1488 
Glu Gly Leu Val Glu Glu Thr Trp Lys Asp lie Asn Lys Glu Phe He 
485 490 495 

gcc aca act aat tat aat gtg ggt aga gaa att gcc ate aca ttc etc 153 6 
Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu lie Ala He Thr Phe Leu 
500 505 510 

aac tac get egg ata tgt gaa gcc agt tac age aaa act gac gga gac 1584 
Asn Tyr Ala Arg lie Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp 
513 520 525 

get tat tea gat cct aat gtt gcc aag gca aat gtc gtt get etc ttt 1632 
Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe 
530 535 540 

gtt gat gcc ata gtc ttt 1650 
Val Asp Ala lie Val Phe 
545 550 



<210> 7
<211> 550
<212> PRT
<213> Artificial Sequence

<400> 7

Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

Pro Met Ser Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn
20 25 30

Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45 

Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
50 55 60

Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95

Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg
100 105 110

Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Asp Lys
115 120 125

Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
130 135 140

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu
145 150 155 160

Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His His Leu Glu
165 170 175

Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205

Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240

Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
245 250 255

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val
260 265 270

Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285

Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys
325 330 335

Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr
355 360 365

Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
370 375 380

Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400

Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415

Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430

Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445

Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe
465 470 475 480

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495

Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510

Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe
530 535 540

Val Asp Ala Ile Val Phe
545 550



<210> 8
<211> 1650
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: nucleic acid
sequence encoding peppermint E-beta-farnesene

<220>
<221> CDS
<222> (1)..(1650)
<223> Computer-generated nucleic acid sequence encoding
peppermint E-beta-farnesene synthase protein

<400> 8

atg gct aca aac ggc gtc gta att agt tgc tta agg gaa gta agg cca 48
Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

cct atg acg aag cat gcg cca agc atg tgg act gat acc ttt tct aac 96
Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn
20 25 30

ttt tct ctt gac gat aag gaa caa caa aag tgc tca gaa acc atc gaa 144
Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45

gca ctt aag caa gaa gca aga ggc atg ctt atg gct gca acc act cct 192
Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
50 55 60

ctc caa caa atg aca cta atc gac act ctc gag cgt ttg gga ttg tct 240
Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

ttc cat ttt gag acg gag atc gaa tac aaa atc gaa cta atc aac gct 288
Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95

gca gaa gac gac ggc ttt gat ttg ttc gct act gct ctt cgt ttc cgt 336
Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg
100 105 110

ttg ctc aga caa cat caa cgc cac gtt tct tgt gat gtt ttc gac aag 384
Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Asp Lys
115 120 125

ttc atc gac aaa gat ggc aag ttc gaa gaa tcc ctt agc aat aat gtt 432
Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
130 135 140

gaa ggc cta tta agc ttg tat gaa gca gct cat gtt ggg ttt cgc gaa 480
Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu
145 150 155 160

gaa aga ata tta caa gag gct gta aat ttt acg agg cat cac ttg gaa 528
Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His His Leu Glu
165 170 175

gga gca gag tta gat cag tct cca tta ttg att aga gag aaa gtg aag 576
Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190

cga gct ttg gag cac cct ctt cat agg gat ttc ccc att gtc tat gca 624
Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205

cgc ctt ttc atc tcc att tac gaa aag gat gac tct aga gat gaa tta 672
Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220

ctt ctc aag cta tcc aaa gtc aac ttc aaa ttc atg cag aat ttg tat 720
Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240

aag gaa gag ctc tcc caa ctc tcc agg tgg tgg aac aca tgg aat ctg 768
Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
245 250 255

aaa tca aaa tta cca tat gca aga gat cga gtc gtg gag gct tat gtt 816
Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val
260 265 270

tgg gga gta ggt tac cat tac gaa ccc caa tac tca tat gtt cga atg 864
Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285

gga ctt gcc aaa ggc gta cta att tgt gga atc atg gac gat aca tat 912
Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300

gat aat tat gct aca ctc aat gaa gct caa ctt ttt act caa gtc tta 960
Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320

gac aag tgg gat aga gat gaa gct gaa cga ctc cca gaa tac atg aaa 1008
Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys
325 330 335

atc gtt tat cga ttt att ttg agt ata tat gaa aat tat gaa cgt gat 1056
Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350

gca gcg aaa ctt gga aaa agc ttt gca gct cct tat ttt aag gaa acc 1104
Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr
355 360 365

gtg aaa caa ctg gca agg gca ttt aat gag gag cag aag tgg gtt atg 1152
Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
370 375 380

gaa agg cag cta ccg tca ttc caa gac tac gta aag aat tca gag aaa 1200
Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400

acc agc tgc att tat acc atg ttt gct tct atc atc cca ggc ttg aaa 1248
Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415

tct gtt acc caa gaa acc att gat tgg atc aag agt gaa ccc acg ctc 1296
Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430

gca aca tcg acc gct atg atc ggt cgg tat tgg aat gac acc agc tct 1344
Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445

cag ctc cgt gaa agc aaa gga ggg gaa atg ctg act gcg ttg gat ttc 1392
Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460

cac atg aaa gaa tat ggt ctg acg aag gaa gag gcg gca tct aag ttt 1440
His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe
465 470 475 480

gaa gga ttg gtt gag gaa aca tgg aag gat ata aac aag gaa ttc ata 1488
Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495

gcc aca act caa tat aat gtg ggt aga gaa att gcc atc aca ttc ctc 1536
Ala Thr Thr Gln Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510

aac tac gct cgg ata tgt gaa gcc agt tac agc aaa act gac gga gac 1584
Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525

gct tat tca gat cct aat gtt gcc aag gca aat gtc gtt gct ctc ttt 1632
Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe
530 535 540

gtt gat gcc ata gtc ttt 1650
Val Asp Ala Ile Val Phe
545 550



<210> 9
<211> 550
<212> PRT
<213> Artificial Sequence

<400> 9

Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn
20 25 30

Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45

Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
50 55 60

Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95

Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg
100 105 110

Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Asp Lys
115 120 125

Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
130 135 140

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu
145 150 155 160

Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His His Leu Glu
165 170 175

Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205

Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240

Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
245 250 255

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val
260 265 270

Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285

Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys
325 330 335

Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr
355 360 365

Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
370 375 380

Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400

Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415

Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430

Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445

Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe
465 470 475 480

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495

Ala Thr Thr Gln Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510

Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe
530 535 540

Val Asp Ala Ile Val Phe
545 550



<210> 10
<211> 1650
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: nucleic acid
sequence encoding E-beta-farnesene synthase
protein

<220>
<221> CDS
<222> (1)..(1650)
<223> Computer-generated nucleic acid sequence encoding
peppermint E-beta-farnesene synthase protein

<400> 10

atg gct aca aac ggc gtc gta att agt tgc tta agg gaa gta agg cca 48
Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

cct atg acg aag cat gcg cca agc atg tgg act gat acc ttt tct aac 96
Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn
20 25 30

ttc tct ctt gac gat aag gaa caa caa aag tgc tca gaa acc atc gaa 144
Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45

gca ctt aag caa gaa gca aga ggc atg ctt atg gct gca acc act cct 192
Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
50 55 60

ctc caa caa atg aca cta atc gac act ctc gag cgt ttg gga ttg tct 240
Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

ttc cat ttt gag acg gag atc gaa tac aaa atc gaa cta atc aac gct 288
Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95

gca gaa gac gac ggc ttt gat ttg ttc gct act gct ctt cgt ttc cgt 336
Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg
100 105 110

ttg ctc aga caa cat caa cgc cac gtt tcg tgt gat gtt ttc gac aag 384
Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Asp Lys
115 120 125

ttc atc gac aaa gat ggc aag ttc gaa gaa tcc ctt agc aat aat gtt 432
Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
130 135 140

gaa ggc cta tta agc ttg tat gaa gca gct cat gtt ggg ttt cgc gaa 480
Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu
145 150 155 160

gaa aga ata tta caa gag gct gta aat ttt acg agg cat cac ttg gaa 528
Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His His Leu Glu
165 170 175

gga gca gag tta gat cag tct cca tta ttg att aga gag aaa gtg aag 576
Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190

cga gct ttg gag cac cct ctt cat agg gat ttc ccc att gtc tat gca 624
Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205

cgc ctt ttc atc tcc att tac gaa aag gat gac tct aga gat gaa tta 672
Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220

ctt ctc aag cta tcc aaa gtc aac ttc aaa ttc atg cag aat ttg tat 720
Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240

aag gaa gag ctc tcc caa ctc tcc agg tgg tgg aac aca tgg aat ctg 768
Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
245 250 255

aaa tca aaa tta cca tat gca aga gat cga gtc gtg gag gct tat gtt 816
Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val
260 265 270

tgg gga gta ggt tac cat tac gaa ccc caa tac tca tat gtt cga atg 864
Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285

gga ctt gcc aaa ggc gta cta att tgt gga atc atg gac gat aca tat 912
Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300

gat aat tat gct aca ctc aat gaa gct caa ctt ttt act caa gtc tta 960
Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320

gac aag tgg gat aga gat gaa gct gaa cga ctc cca gaa tac atg aaa 1008
Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys
325 330 335

atc gtt tat cga ttt att ttg agt ata tat gaa aat tat gaa cgt gat 1056
Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350

gca gcg aaa ctt gga aaa agc ttt gca gct cct tat ttt aag gaa acc 1104
Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr
355 360 365

gtg aaa caa ctg gca agg gca ttt aat gag gag cag aag tgg gtt atg 1152
Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
370 375 380

gaa agg cag cta ccg tca ttc caa gac tac gta aag aat tca gag aaa 1200
Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400

acc agc tgc att tat acc atg ttt gct tct atc atc cca ggc ttg aaa 1248
Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415

tct gtt acc caa gaa acc att gat tgg atc aag agt gaa ccc acg ctc 1296
Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430

gca aca tcg acc gct atg atc ggt cgg tat tgg aat gac acc agc tct 1344
Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445

cag ctc cgt gaa agc aaa gga ggg gaa atg ctg act gcg ttg gat ttc 1392
Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460

cac atg aaa gaa tat ggt ctg acg aag gaa gag gcg gca tct aag ttt 1440
His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe
465 470 475 480

gaa gga ttg gtt gag gaa aca tgg aag gat ata aac aag gaa ttc ata 1488
Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495

gcc aca act aat tat aat gtg ggt aga gaa att gcc atc aca ttc ctc 1536
Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510

aac tac gct cgg ata tgt gaa gcc agt tac agc aaa act gac gga gac 1584
Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525

gct tat tca gat cct aat gtt gcc aag gca aat gtc gtt gct ctc ttt 1632
Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe
530 535 540

gtt gat gcc ata gtc ttt 1650
Val Asp Ala Ile Val Phe
545 550



<210> 11
<211> 550
<212> PRT
<213> Artificial Sequence

<400> 11

Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn
20 25 30

Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45

Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
50 55 60

Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95

Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg
100 105 110

Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Asp Lys
115 120 125

Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
130 135 140

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu
145 150 155 160

Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His His Leu Glu
165 170 175

Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205

Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240

Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
245 250 255

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val
260 265 270

Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285

Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys
325 330 335

Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr
355 360 365

Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
370 375 380

Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400

Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415

Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430

Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445

Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe
465 470 475 480

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495

Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510

Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe
530 535 540

Val Asp Ala Ile Val Phe
545 550



<210> 12
<211> 1650
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: nucleic acid
sequence encoding E-beta-farnesene synthase

<220>
<221> CDS
<222> (1)..(1650)
<223> Computer-generated nucleic acid sequence encoding
peppermint E-beta-farnesene synthase protein

<400> 12

atg gct ggg aac ggc gtc gta att agt tgc tta agg gaa gta agg cca 48
Met Ala Gly Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

cct atg acg aag cat gcg cca agc atg tgg act gat acc ttt tct aac 96
Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn
20 25 30

ttt tct ctt gac gat aag gaa caa caa aag tgc tca gaa acc atc gaa 144
Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45

gca ctt aag caa gaa gca aga ggc atg ctt atg gct gca acc act cct 192
Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
50 55 60

ctc caa caa atg aca cta atc gac act ctc gag cgt ttg gga ttg tct 240
Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

ttc cat ttt gag acg gag atc gaa tac aaa atc gaa cta atc aac gct 288
Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95

gca gaa gac gac ggc ttt gat ttg ttc gct act gct ctt cgt ttc cgt 336
Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg
100 105 110

ttg ctc aga caa cat caa cgc cac gtt tct tgt gat gtt ttc gac aag 384
Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Asp Lys
115 120 125

ttc atc gac aaa gat ggc aag ttc gaa gaa tcc ctt agc aat aat gtt 432
Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
130 135 140

gaa ggc cta tta agc ttg tat gaa gca gct cat gtt ggg ttt cgc gaa 480
Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu
145 150 155 160

gaa aga ata tta caa gag gct gta aat ttt acg agg cat cac ttg gaa 528
Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His His Leu Glu
165 170 175

gga gca gag tta gat cag tct cca tta ttg att aga gag aaa gtg aag 576
Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190

cga gct ttg gag cac cct ctt cat agg gat ttc ccc att gtc tat gca 624
Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205

cgc ctt ttc atc tcc att tac gaa aag gat gac tct aga gat gaa tta 672
Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220

ctt ctc aag cta tcc aaa gtc aac ttc aaa ttc atg cag aat ttg tat 720
Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240

aag gaa gag ctc tcc caa ctc tcc agg tgg tgg aac aca tgg aat ctg 768
Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
245 250 255

aaa tca aaa tta cca tat gca aga gat cga gtc gtg gag gct tat gtt 816
Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val
260 265 270

tgg gga gta ggt tac cat tac gaa ccc caa tac tca tat gtt cga atg 864
Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285

gga ctt gcc aaa ggc gta cta att tgt gga atc atg gac gat aca tat 912
Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300

gat aat tat gct aca ctc aat gaa gct caa ctt ttt act caa gtc tta 960
Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320

gac aag tgg gat aga gat gaa gct gaa cga ctc cca gaa tac atg aaa 1008
Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys
325 330 335

atc gtt tat cga ttt att ttg agt ata tat gaa aat tat gaa cgt gat 1056
Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350

gca gcg aaa ctt gga aaa agc ttt gca gct cct tat ttt aag gaa acc 1104
Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr
355 360 365

gtg aaa caa ctg gca agg gca ttt aat gag gag cag aag tgg gtt atg 1152
Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
370 375 380

gaa agg cag cta ccg tca ttc caa gac tac gta aag aat tca gag aaa 1200
Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400

acc agc tgc att tat acc atg ttt gct tct atc atc cca ggc ttg aaa 1248
Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415

tct gtt acc caa gaa acc att gat tgg atc aag agt gaa ccc acg ctc 1296
Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430

gca aca tcg acc gct atg atc ggt cgg tat tgg aat gac acc agc tct 1344
Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445

cag ctc cgt gaa agc aaa gga ggg gaa atg ctg act gcg ttg gat ttc 1392
Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460

cac atg aaa gaa tat ggt ctg acg aag gaa gag gcg gca tct aag ttt 1440
His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe
465 470 475 480

gaa gga ttg gtt gag gaa aca tgg aag gat ata aac aag gaa ttc ata 1488
Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495

gcc aca act aat tat aat gtg ggt aga gaa att gcc atc aca ttc ctc 1536
Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510

aac tac gct cgg ata tgt gaa gcc agt tac agc aaa act gac gga gac 1584
Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525

gct tat tca gat cct aat gtt gcc aag gca aat gtc gtt gct ctc ttt 1632
Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe
530 535 540

gtt gat gcc ata gtc ttt 1650
Val Asp Ala Ile Val Phe
545 550



<210> 13
<211> 550
<212> PRT
<213> Artificial Sequence

<400> 13

Met Ala Gly Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn
20 25 30

Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45

Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
50 55 60

Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95

Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg
100 105 110

Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Asp Lys
115 120 125

Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
130 135 140

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu
145 150 155 160

Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His His Leu Glu
165 170 175

Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205

Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240

Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
245 250 255

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val
260 265 270

Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285

Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys
325 330 335

Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr
355 360 365

Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
370 375 380

Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400

Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415

Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430

Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445

Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe
465 470 475 480

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495

Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510

Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe
530 535 540

Val Asp Ala Ile Val Phe
545 550



<210> 14 
<211> 1650 
<212> DMA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: nucleic acid 
sequence encoding E-beta-£arnesene synthase 
protein 

<220> 

<221> CDS 

<222> {IJ . . (1650) 

<223> Con^uter-generated nucleic acid secjuence encoding 
peppermint E-beta-famesene synthase protein 
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<400> 14 

atg get aca aac ggc gtc gta att agt tgc tta agg gaa gta agg cca 48 
Met AXa Thr Asn Gly Val Val He Ser Cys Leu Arg BXu Val Arg Pro 
^5 10 15 



cct atg acg aag cat gcg cca age atg tgg act gat acc ttt tct aac 96 
Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn 
20 25 30 



ttt tct ctt gac gat aag gaa caa caa aag tgc tea gaa acc ate gaa 144 
Phe Ser Leu Asp Asp Lys Glu Gin Gin Lys Cys Ser Glu Thr He Glu 
35 40 45 



gca ctt aag caa gaa gca aga ggc atg ctt atg get gca acc act cct 192 
Ala Leu Lys Gin Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro 
50 55 go 



etc caa caa atg aca eta ate gac act etc gag cgt ttg gga ttg tct 240 
Leu Gin Gin Met Thr Leu He Asp Thr Leu Glu Arg Leu Gly Leu Ser 
^5 70 75 so 

ttc cat ttt gag acg gag ate gaa tac aaa ate gaa eta ate aac get 28 S 
Phe His Phe Glu Thr Glu He Glu Tyr Lys He Glu Leu He Asn Ala 
85 90 95 



gca gaa gac gac ggc ttt gat ttg ttc get act get ctt cgt ttc cgt 336 
Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg 
100 105 110 

ttg etc aga caa eat caa cgc cac gtt tct tgt gat gtt ttc gae aag 384 
Leu Leu Arg Gin His Gin Arg His Val Ser Cys Asp Val Phe Asp Lys 
115 120 125 

ttc ate gac aaa gat ggc aag ttc gaa gaa tec ctt age aat aat gtt 432 
Phe He Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val 

135 140 



gaa ggc eta tta age ttg tat gaa gca get cat gtt ggg ttt cgc gaa 480 
Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu 
145 150 155 160 



gaa aga ata tta caa gag get gta aat ttt acg agg cat cac ttg gaa 528 
Glu Arg He Leu Gin Glu Ala Val Asn Phe Thr Arg His His Leu Glu 
1€S 170 175 



gga gca gag tta gat cag tct cca tta ttg att aga gag aaa gtg aag 576
Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190

cga gct ttg gag cac cct ctt cat agg gat ttc ccc att gtc tat gca 624
Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205

cgc ctt ttc atc tcc att tac gaa aag gat gac tct aga gat gaa tta 672
Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220

ctt ctc aag cta tcc aaa gtc aac ttc aaa ttc atg cag aat ttg tat 720
Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240

aag gaa gag ctc tcc caa ctc tcc agg tgg tgg aac aca tgg aat ctg 768
Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
245 250 255

aaa tca aaa tta ccc tat gca aga gat cga gtc gtg gag gct tat gtt 816
Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val
260 265 270

tgg gga gta ggt tac cat tac gaa ccc caa tac tca tat gtt cga atg 864
Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285

gga ctt gcc aaa ggc gta cta att tgt gga atc atg gac gat aca tat 912
Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300



gat aat tat gct aca ctc aat gaa gct caa ctt ttt act caa gtc tta 960
Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320

gac aag tgg gat aga gat gaa gct gaa cga ctc cca gaa tac atg aaa 1008
Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys
325 330 335

atc gtt tat cga ttt att ttg agt ata tat gaa aat tat gaa cgt gat 1056
Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350

gca gcg aaa ctt gga aaa agc ttt gca gct cct tat ttt aag gaa acc 1104
Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr
355 360 365

gtg aaa caa ctg gca agg gca ttt aat gag gag cag aag tgg gtt atg 1152
Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
370 375 380

gaa agg cag cta ccg tca ttc caa gac tac gta aag aat tca gag aaa 1200
Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400

acc agc tgc att tat acc atg ttt gct tct atc atc cca ggc ttg aaa 1248
Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415

tct gtt acc caa gaa acc att gat tgg atc aag agt gaa ccc acg ctc 1296
Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430

gca aca tcg acc gct atg atc ggt cgg tat tgg aat gac acc agc tct 1344
Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445

cag ctc cgt gaa agc aaa gga ggg gaa atg ctg act gcg ttg gat ttc 1392
Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460

cac atg aaa gaa tat ggt ctg acg aag gaa gag gcg gca tct aag ttt 1440
His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe
465 470 475 480

gaa gga ttg gtt gag gaa aca tgg aag gat ata aac aag gaa ttc ata 1488
Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495

gcc aca act aat tat aat gtg ggt aga gaa att gcc atc aca ttc ctc 1536
Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510

aac tac gct cgg ata tgt gaa gcc agt tac agc aaa act gac gga gac 1584
Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525

gct tat tca gat cct aat gtt gcc aag gca aat gtc gtt gct ctc ttt 1632
Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe
530 535 540

gtt gat gcc ata gtc ttt 1650
Val Asp Ala Ile Val Phe
545 550



<210> 15 
<211> 550
<212> PRT

<213> Artificial Sequence 
<400> 15 

Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn
20 25 30

Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45

Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
50 55 60

Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

Phe His Phe Glu Thr Glu lie Glu Tyr Lys He Glu Leu He Asn Ala 
85 90 95 

Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg 
100 105 110 

Leu Leu Arg Gin His Gin Arg His Val Ser Cys Asp Val Phe Asp Lys 
115 120 125 

Phe He Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val 
130 135 140 

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu
145 150 155 160

Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His His Leu Glu
165 170 175

Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro He Val Tyr Ala 
195 200 205 

Arg Leu Phe He Ser He Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu 
210 215 220 

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gin Asn Leu Tyr 
225 230 235 240

Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
245 250 255

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val
260 265 270

Trp Gly Val Gly Tyr His Tyr Glu Pro Gin Tyr Ser Tyr Val Arg Met 
275 280 285 

Gly Leu Ala Lys Gly Val Leu lie Cys Gly He Met Asp Asp Thr Tyr 
290 295 300 

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gin Leu Phe Thr Gin Val Leu 
305 310 315 320 

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys 
325 330 335 

He Val Tyr Arg Phe He Leu Ser He Tyr Glu Asn Tyr Glu Arg Asp 
340 345 350 

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr 
355 360 365 

Val Lys Gin Leu Ala Arg Ala Phe Asn Glu Glu Gin Lys Trp Val Met 
370 375 380 

Glu Arg Gin Leu Pro Ser Phe Gin Asp Tyr Val Lys Asn Ser Glu Lys 
385 390 395 400 

Thr Ser Cys He Tyr Thr Met Phe Ala Ser He He Pro Gly Leu Lys 
405 410 415 

Ser Val Thr Gin Glu Thr He Asp Trp He Lys Ser Glu Pro Thr Leu 
420 425 430 

Ala Thr Ser Thr Ala Met He Gly Arg Tyr Trp Asn Asp Thr Ser Ser 
435 440 445 

Gin Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe 
450 455 460 

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe 
465 470 475 480 

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp He Asn Lys Glu Phe He 
485 490 495

Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510

Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe
530 535 540

Val Asp Ala Ile Val Phe
545 550



<210> 16 
<211> 1650 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: nucleic acid 
sequence encoding E-beta-farnesene synthase

<220>

<221> CDS

<222> (1)..(1650)

<223> Computer-generated nucleic acid sequence encoding
peppermint E-beta-farnesene synthase protein

<400> 16 

atg gct aca aac ggc gtc gta att agt tgc tta agg gaa gta agg cca 48
Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

cct atg acg aag cat gcg cca agc atg tgg act gat acc ttt tct aac 96
Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn
20 25 30

ttt tct ctt gac gat aag gaa caa caa aag tgc tca gaa acc atc gaa 144
Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45

gca ctt aag caa gaa gca aga ggc atg ctt atg gct gca acc act cct 192
Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
50 55 60

ctc caa caa atg aca cta atc gac act ctc gag cgt ttg gga ttg tct 240
Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

ttc cat ttt gag acg gag atc gaa tac aaa atc gaa cta atc aac gct 288
Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95

gca gaa gac gac ggc ttt gat ttg ttc gct act gct ctt cgt ttc cgt 336
Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg
100 105 110

ttg ctc aga caa cat caa cgc cac gtt tct tgt gat gtt ttc gac aag 384
Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Asp Lys
115 120 125

ttc atc gac aaa gat ggc aag ttc gaa gaa tcc ctt agc aat aat gtt 432
Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
130 135 140

gaa ggc cta tta agc ttg tat gaa gca gct cat gtt ggg ttt cgc gaa 480
Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu
145 150 155 160

gaa aga ata tta caa gag gct gta aat ttt acg agg cat cac ttg gaa 528
Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His His Leu Glu
165 170 175

gga gca gag tta gat cag tct cca tta ttg att aga gag aaa gtg aag 576
Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190

cga gct ttg gag cac cct ctt cat agg gat ttc ccc att gtc tat gca 624
Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205



cgc ctt ttc atc tcc att tac gaa aag gat gac tct aga gat gaa tta 672
Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220

ctt ctc aag cta tcc aaa gtc aac ttc aaa ttc atg cag aat ttg tat 720
Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240

aag gaa gag ctc tcc caa ctc tcc agg tgg tgg aac aca tgg aat ctg 768
Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
245 250 255

aaa tca aaa tta cca tat gca aga gat cga gtc gtg gag gct tat gtt 816
Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val
260 265 270

tgg gga gta ggt tac cat tac gaa ccc caa tac tca tat gtt cga atg 864
Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285 

gga ctt gcc aaa ggc gta cta att tgt gga atc atg gac gat aca tat 912
Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300

gat aat tat gct aca ctc aat gaa gct caa ctt ttt act caa gtc tta 960
Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320

gac aag tgg gat aga gat gaa gct gaa cga ctc cca gaa tac atg aaa 1008
Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys
325 330 335

atc gtt tat cga ttt att ttg agt ata tat gaa aat tat gaa cgt gat 1056
Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350



gca gcg aaa ctt gga aaa agc ttt gca gct cct tat ttt aag gaa acc 1104
Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr
355 360 365

gtg aaa caa ctg gca agg gca ttt aat gag gag cag aag tgg gtt atg 1152
Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
370 375 380

gaa agg cag cta ccg tca ttc caa gac tac gta aag aat acg gag aaa 1200
Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Thr Glu Lys
385 390 395 400

acc agc tgc att tat acc atg ttt gct tct atc atc cca ggc ttg aaa 1248
Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415

tct gtt acc caa gaa acc att gat tgg atc aag agt gaa ccc acg ctc 1296
Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430

gca aca tcg acc gct atg atc ggt cgg tat tgg aat gac acc agc tct 1344
Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445

cag ctc cgt gaa agc aaa gga ggg gaa atg ctg act gcg ttg gat ttc 1392
Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460

cac atg aaa gaa tat ggt ctg acg aag gaa gag gcg gca tct aag ttt 1440 
His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe 
465 470 475 480 

gaa gga ttg gtt gag gaa aca tgg aag gat ata aac aag gaa ttc ata 1488 
Glu Gly Leu Val Glu Glu Thr Trp Lys Asp He Asn Lys Glu Phe He 
485 490 495 

gcc aca act aat tat aat gtg ggt aga gaa att gcc atc aca ttc ctc 1536
Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510

aac tac gct cgg ata tgt gaa gcc agt tac agc aaa act gac gga gac 1584
Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525

gct tat tca gat cct aat gtt gcc aag gca aat gtc gtt gct ctc ttt 1632
Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe 
530 535 540 

gtt gat gcc ata gtc ttt 1650 
Val Asp Ala He Val Phe 
545 550 



<210> 17 
<211> 550 
<212> PRT 

<213> Artificial Sequence 
<400> 17 

Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn 
20 25 30 

Phe Ser Leu Asp Asp Lys Glu Gin Gin Lys Cys Ser Glu Thr He Glu 
35 40 45 

Ala Leu Lys Gin Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro 
50 55 60 

Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95

Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg 
100 105 110 

Leu Leu Arg Gin His Gin Arg His Val Ser Cys Asp Val Phe Asp Lys 
115 120 125 

Phe He Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val 
130 135 140 

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu 
145 150 155 160 

Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His His Leu Glu
165 170 175

Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro He Val Tyr Ala 
195 200 205 

Arg Leu Phe He Ser He Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu 
210 215 220 

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gin Asn Leu Tyr 
225 230 235 240 

Lys Glu Glu Leu Ser Gin Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu 
245 250 255

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val 
260 265 270 

Trp Gly Val Gly Tyr His Tyr Glu Pro Gin Tyr Ser Tyr Val Arg Met 
275 280 285 

Gly Leu Ala Lys Gly Val Leu He Cys Gly He Met Asp Asp Thr Tyr 
290 295 300 

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gin Leu Phe Thr Gin Val Leu 
305 310 315 320 

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys 
325 330 335

Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr 
355 360 365 

Val Lys Gin Leu Ala Arg Ala Phe Asn Glu Glu Gin Lys Trp Val Met 
370 375 380 

Glu Arg Gin Leu Pro Ser Phe Gin Asp Tyr Val Lys Asn Thr Glu Lys 
385 390 395 400 

Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415 

Ser Val Thr Gin Glu Thr He Asp Trp He Lys Ser Glu Pro Thr Leu 
420 425 430 

Ala Thr Ser Thr Ala Met He Gly Arg Tyr Trp Asn Asp Thr Ser Ser 
435 440 445 

Gin Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe 
450 455 460

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe 
465 470 475 480 

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp He Asn Lys Glu Phe He 
485 490 495 

Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu He Ala He Thr Phe Leu 
500 505 510 

Asn Tyr Ala Arg He Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp 
515 520 525 

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe 
530 535 540 

Val Asp Ala He Val Phe 
545 550 



<210> 18 
<211> 1650 
<212> DNA 

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: nucleic acid
sequence encoding E-beta-farnesene synthase

<220>

<221> CDS

<222> (1)..(1650)

<223> Computer-generated nucleic acid sequence encoding
peppermint E-beta-farnesene synthase protein

<400> 18 

atg gct aca aac ggc gtc gta att agt tgc tta agg gaa gta agg cca 48
Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

cct atg acg aag cat gcg cca age atg tgg act gat acc ttt tct aac 96 
Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn 
20 25 30 

ttt tct ctt gac gat aag gaa caa caa aag tgc tea gaa acc ate gaa 144 
Phe Ser Leu Asp Asp Lys Glu Gin Gin Lys Cys Ser Glu Thr He Glu 
35 40 45 

gca ctt aag caa gaa gca aga ggc atg ctt atg get gca acc act cct 192 
Ala Leu Lys Gin Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro 
50 55 60 

ctc caa caa atg aca cta atc gac act ctc gag cgt ttg gga ttg tct 240
Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

ttc cat ttt gag acg gag ate gaa tac aaa ate gaa eta ate aac get 288 
Phe His Phe Glu Thr Glu He Glu Tyr Lys He Glu Leu He Asn Ala 
85 90 95 

gca gaa gac gac ggc ttt gat ttg ttc get act get ctt cgt ttc cgt 336 
Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg 
100 105 110 

ttg etc aga caa cat caa cgc cac gtt tct tgt gat gtt ttc gac aag 384 
Leu Leu Arg Gin His Gin Arg His Val Ser Cys Asp Val Phe Asp Lys 
115 120 125 

ttc ate gac aaa gat ggc aag ttc gaa gaa tec ctt age aat aat gtt 432 
Phe He Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val 
130 135 140 

gaa ggc cta tta agc ttg tat gaa gca gct cat gtt ggg ttt cgc gaa 480
Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu
145 150 155 160

gaa aga ata tta caa gag get gta aat ttt acg agg cat cac ttg gaa 528 
Glu Arg lie Leu Gin Glu Ala Val Asn Phe Thr Arg His His Leu Glu 
165 170 175 

gga gca gag tta gat cag tct cca tta ttg att aga gag aaa gtg aag 576 
Gly Ala Glu Leu Asp Gin Ser Pro Leu Leu lie Arg Glu Lys Val Lys 
180 185 190 

cga gct ttg gag cac cct ctt cat agg gat ttc ccc att gtc tat gca 624
Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205

cgc ctt ttc atc tcc att tac gaa aag gat gac tct aga gat gaa tta 672
Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220

ctt ctc aag cta tcc aaa gtc aac ttc aaa ttc atg cag aat ttg tat 720
Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240 

aag gaa gag etc tec caa etc tec agg tgg tgg aac aca tgg aat ctg 768 
Lys Glu Glu Leu Ser Gin Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu 
245 250 255 

aaa tca aaa tta cca tat gca aga gat cga gtc gtg gag gct tat gtt 816
Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val
260 265 270

tgg gga gta ggt tac cat tac gaa ccc caa tac tca tat gtt cga atg 864
Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285

gga ctt gcc aaa ggc gta eta att tgt gga ate atg gac gat aca tat 912 
Gly Leu Ala Lys Gly Val Leu He Cys Gly He Met Asp Asp Thr Tyr 
290 295 300 

gat aat tat get aca etc aat gaa get caa ctt ttt act caa gtc tta 960 
Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gin Leu Phe Thr Gin Val Leu 
305 310 315 320 

gac aag tgg gat aga gat gaa gct gaa cga ctc cca gaa tac atg aaa 1008
Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys 
325 330 335 

atc gtt tat cga ttt att ttg agt ata tat gaa aat tat gaa cgt gat 1056
Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350

gca gcg aaa ctt gga aaa agc ttt gca gct cct tat ttt aag gaa acc 1104
Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr
355 360 365

gtg aaa caa ctg gca agg gca ttt aat gag gag cag aag tgg gtt atg 1152
Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
370 375 380

gaa agg cag cta ccg tca ttc caa gac tac gta aag aat tca gag aaa 1200
Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400

acc agc tgc att tat acc atg ttt gct tct atc atc cca ggc ttg aaa 1248
Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415

tct gtt acc caa gaa acc att gat tgg atc aag agt gaa ccc acg ctc 1296
Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430

gca aca tcg acc gct atg atc ggt cgg tat tgg aat gac acc agc tct 1344
Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445

cag ctc cgt gaa agc aaa gga ggg gaa atg ctg act gcg ttg gat ttc 1392
Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460

cac atg aaa gaa tat ggt ctg acg aag gaa gag gcg gca tct aag ttt 1440
His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe
465 470 475 480

gaa gga ttg gtt gag gaa aca tgg aag gat ata aac aag gaa ttc ata 1488
Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495

gcc aca act aat tat aat gtg ggt aga gaa att gcc atc aca ttc ctc 1536
Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510

aac tac gct cgg ata tgt gaa gcc agt tac agc aaa act gac gga gac 1584
Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525

gct tat tca gat cct aat gtt gcc aag gca aat gtc gtt gct ctc ttt 1632
Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe
530 535 540

gtt gat gcc gtc ata ttt 1650
Val Asp Ala Val Ile Phe
545 550



<210> 19 
<211> 550 
<212> PRT 

<213> Artificial Sequence 
<400> 19 

Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn 
20 25 30 

Phe Ser Leu Asp Asp Lys Glu Gin Gin Lys Cys Ser Glu Thr He Glu 
35 40 45 

Ala Leu Lys Gin Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro 
50 55 60 

Leu Gin Gin Met Thr Leu He Asp Thr Leu Glu Arg Leu Gly Leu Ser 
65 70 75 80 

Phe His Phe Glu Thr Glu He Glu Tyr Lys He Glu Leu He Asn Ala 
85 90 95 

Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg 
100 105 110 

Leu Leu Arg Gin His Gin Arg His Val Ser Cys Asp Val Phe Asp Lys 
115 120 125 

Phe He Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val 
130 135 140 

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu 
145 150 155 160 

Glu Arg He Leu Gin Glu Ala Val Asn Phe Thr Arg His His Leu Glu 
165 170 175 

Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205

Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gin Asn Leu Tyr 
225 230 235 240 

Lys Glu Glu Leu Ser Gin Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu 
245 250 255 

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val 
260 265 270 

Trp Gly Val Gly Tyr His Tyr Glu Pro Gin Tyr Ser Tyr Val Arg Met 
275 280 285 

Gly Leu Ala Lys Gly Val Leu He Cys Gly He Met Asp Asp Thr Tyr 
290 295 300 

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320 

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys 
325 330 335 

He Val Tyr Arg Phe He Leu Ser He Tyr Glu Asn Tyr Glu Arg Asp 
340 345 350 

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr 
355 360 365 

Val Lys Gin Leu Ala Arg Ala Phe Asn Glu Glu Gin Lys Trp Val Met 
370 375 380 

Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400 

Thr Ser Cys He Tyr Thr Met Phe Ala Ser He He Pro Gly Leu Lys 
405 410 415 

Ser Val Thr Gin Glu Thr He Asp Trp He Lys Ser Glu Pro Thr Leu 
420 425 430 

Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445

Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe
465 470 475 480

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495

Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510

Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe
530 535 540

Val Asp Ala Val Ile Phe
545 550



<210> 20 
<211> 550 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
E-beta-farnesene synthase protein

<220>

<221> VARIANT
<222> (1)..(550)

<223> Computer-generated E-beta-farnesene synthase
protein variant



<400> 20 

Met Ala Thr Asn Gly Val Leu Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn
20 25 30

Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45

Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
50 55 60

Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95

Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg 
100 105 110 

Leu Leu Arg Gin His Gin Arg His Val Ser Cys Asp Val Phe Asp Lys 
115 120 125 

Phe He Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val 
130 135 140 

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu
145 150 155 160 

Glu Arg lie Leu Gin Glu Ala Val Asn Phe Thr Arg His His Leu Glu 
165 170 175 

Gly Ala Glu Leu Asp Gin Ser Pro Leu Leu He Arg Glu Lys Val Lys 
180 185 190 

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro He Val Tyr Ala 
195 200 205 

Arg Leu Phe He Ser He Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu 
210 215 220 

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240 

Lys Glu Glu Leu Ser Gin Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu 
245 250 255 

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val 
260 265 270 

Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285

Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320 

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys 
325 330 335 

He Val Tyr Arg Phe He Leu Ser He Tyr Glu Asn Tyr Glu Arg Asp 
340 345 350 

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr 
355 360 365 

Val Lys Gin Leu Ala Arg Ala Phe Asn Glu Glu Gin Lys Trp Val Met 
370 375 380 

Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400

Thr Ser Cys lie Tyr Thr Met Phe Ala Ser He He Pro Gly Leu Lys 
405 410 415 

Ser Val Thr Gin Glu Thr He Asp Trp He Lys Ser Glu Pro Thr Leu 
420 425 430 

Ala Thr Ser Thr Ala Met He Gly Arg Tyr Trp Asn Asp Thr Ser Ser 
435 440 445 

Gin Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe 
450 455 460 

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe 
465 470 475 480 

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp He Asn Lys Glu Phe He 
485 490 495 

Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu He Ala He Thr Phe Leu 
500 505 510 

Asn Tyr Ala Arg He Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp 
515 520 525 

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Tyr 
530 535 540 

Val Asp Ala He Val Phe 
545 550

<210> 21
<211> 550
<212> PRT

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence:
E-beta-farnesene synthase protein

<220>

<221> VARIANT
<222> (1)..(550)

<223> Computer-generated E-beta-farnesene synthase
protein variant



<400> 21 

Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

Pro Met Thr Lys His Gly Pro Ser Met Trp Thr Asp Thr Phe Ser Asn 
20 25 30 

Phe Ser Leu Asp Asp Lys Glu Gin Gin Lys Cys Ser Glu Thr lie Glu 
35 40 45 

Ala Leu Lys Gin Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro 
50 55 60 

Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

Phe His Phe Glu Thr Glu lie Glu Tyr Lys lie Glu Leu lie Asn Ala 
85 90 95 



Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg 
100 105 110 

Leu Leu Arg Gin His Gin Arg His Val Ser Cys Asp Val Phe Asp Lys 
115 120 125 

Phe lie Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val 
130 135 140 

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu
145 150 155 160

Glu Arg Ile Leu Gln Gln Ala Val Asn Phe Thr Arg His His Leu Glu
165 170 175

Gly Ala Glu Leu Asp Gin Ser Pro Leu Leu lie Arg Glu Lys Val Lys 
180 185 190 

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro He Val Tyr Ala 
195 200 205 

Arg Leu Phe He Ser He Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu 
210 215 220 

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gin Asn Leu Tyr 
225 230 235 240 

Lys Glu Glu Leu Ser Gin Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu 
245 250 255 

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val 
260 265 270

Trp Gly Val Gly Tyr His Tyr Glu Pro Gin Tyr Ser Tyr Val Arg Met 
275 280 285 

Gly Leu Ala Lys Gly Val Leu He Cys Gly He Met Asp Asp Thr Tyr 
290 295 300

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gin Leu Phe Thr Gin Val Leu 
305 310 315 320 

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys 
325 330 335 

He Val Tyr Arg Phe He Leu Ser He Tyr Glu Asn Tyr Glu Arg Asp 
340 345 350 

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr 
355 360 365 

Val Lys Gin Leu Ala Arg Ala Phe Asn Glu Glu Gin Lys Trp Val Met 
370 375 380 

Glu Arg Gin Leu Pro Ser Phe Gin Asp Tyr Val Lys Asn Ser Glu Lys 
385 390 395 400

Thr Ser Cys He Tyr Thr Met Phe Ala Ser He He Pro Gly Leu Lys 
405 410 415 






Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430

Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445

Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe
465 470 475 480

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495

Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu He Ala He Thr Phe Leu 
500 505 510 

Asn Tyr Ala Arg He Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp 
515 520 525 

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe 
530 535 540 

Val Asp Ala He Val Phe 
545 550 



<210> 22 
<211> 550 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
E-beta-farnesene synthase protein

<220>

<221> VARIANT
<222> (1)..(550)

<223> Computer-generated E-beta-farnesene synthase
protein variant

<400> 22

Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn
20 25 30

Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45

Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
50 55 60

Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95

Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg 
100 105 110 

Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Glu Lys
115 120 125

Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
130 135 140

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu
145 150 155 160

Glu Arg He Leu Gin Glu Ala Val Asn Phe Thr Arg His His Leu Glu 
165 170 175 

Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205

Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220 

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gin Asn Leu Tyr 
225 230 235 240 

Lys Glu Glu Leu Ser Gin Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu 
245 250 255 

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val
260 265 270

Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285

Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300 

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gin Leu Phe Thr Gin Val Leu 
305 310 315 320 

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys
325 330 335

Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr 
355 360 365 

Val Lys Gin Leu Ala Arg Ala Phe Asn Glu Glu Gin Lys Trp Val Met 
370 375 380 

Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400 

Thr Ser Cys He Tyr Thr Met Phe Ala Ser He He Pro Gly Leu Lys 
405 410 415 

Ser Val Thr Gin Glu Thr He Asp Trp He Lys Ser Glu Pro Thr Leu 
420 425 430 

Ala Thr Ser Thr Ala Met He Gly Arg Tyr Trp Asn Asp Thr Ser Ser 
435 440 445 

Gin Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe 
450 455 460 

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe 
465 470 475 480 

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp He Asn Lys Glu Phe He 
485 490 495 

Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu He Ala He Thr Phe Leu 
500 505 510 

Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe
530 535 540

Val Asp Ala Ile Val Phe
545 550



<210> 23 
<211> 550 
<212> PRT

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
E-beta-farnesene synthase protein

<220> 

<221> VARIANT 
<222> (1)..(550)

<223> Computer-generated E-beta-farnesene synthase
protein variant 

<400> 23 

Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn 
20 25 30 

Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45 

Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
50 55 60 

Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80 

Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95 

Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg 
100 105 110 

Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Asp Lys
115 120 125 

Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
130 135 140 
Glu Gly Leu Leu Ser Leu Tyr Glu Ala Gly His Val Gly Phe Arg Glu
145 150 155 160 

Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His His Leu Glu
165 170 175 

Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190 

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205 

Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220 

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240 

Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
245 250 255 

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val 
260 265 270 

Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285 

Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300 

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys 
325 330 335 

Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350 

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr 
355 360 365

Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
370 375 380 

Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400
Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415 

Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430 

Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445 

Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460 

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe 
465 470 475 480 

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495

Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510 

Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525 

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe 
530 535 540 

Val Asp Ala Ile Val Phe
545 550 



<210> 24 
<211> 550 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
E-beta-farnesene synthase protein 

<220> 

<221> VARIANT 
<222> (1)..(550)

<223> Computer-generated E-beta-farnesene synthase
protein variant 

<400> 24 
Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn
20 25 30 

Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45

Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
50 55 60 

Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80 

Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95 

Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg 
100 105 110

Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Asp Lys
115 120 125 

Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
130 135 140 

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu 
145 150 155 160 

Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Ser Arg His His Leu Glu
165 170 175 

Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190 

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205 

Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220 

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240

Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
245 250 255 
Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val
260 265 270 

Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285 

Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300 

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320 

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys 
325 330 335 

Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350 

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr 
355 360 365 

Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
370 375 380 

Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400 

Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415 

Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430 

Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445 

Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460 

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe 
465 470 475 480 

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495 

Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510 
Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525 

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe 
530 535 540 

Val Asp Ala Ile Val Phe
545 550 



<210> 25 
<211> 550 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
E-beta-farnesene synthase protein

<220> 

<221> VARIANT 
<222> (1)..(550)

<223> Computer-generated E-beta-farnesene synthase
protein variant 

<400> 25 

Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn 
20 25 30 

Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45 

Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
50 55 60

Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95 

Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg 
100 105 110 

Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Asp Lys
115 120 125

Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
130 135 140 

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu
145 150 155 160

Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His His Leu Glu
165 170 175 

Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190 

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205

Arg Leu Phe Ile Thr Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220 

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240

Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
245 250 255 

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val 
260 265 270 

Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285 

Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300 

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys 
325 330 335 

Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr 
355 360 365 

Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
370 375 380

Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400

Thr Ser Cys Ile Tyr Ser Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415 

Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430 

Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445 

Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe
465 470 475 480

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495 

Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510 

Asn Tyr Ala Arg Val Cys Glu Ala Ser Tyr Thr Lys Thr Asp Gly Asp
515 520 525

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe 
530 535 540 

Val Asp Ala Ile Val Phe
545 550



<210> 26 
<211> 550 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
E-beta-farnesene synthase protein

<220> 

<221> VARIANT 
<222> (1)..(550)
<223> Computer-generated E-beta-farnesene synthase
protein variant

<400> 26 

Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn 
20 25 30 

Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45 

Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
50 55 60 

Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95 

Ala Glu Asp Asp Ala Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg 
100 105 110 

Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Asp Lys
115 120 125 

Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
130 135 140 

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu 
145 150 155 160 

Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His His Leu Glu
165 170 175 

Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190 

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205 

Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220 

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240 
Lys Glu Asp Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
245 250 255 

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val
260 265 270 

Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285

Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300 

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320 

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys 
325 330 335 

Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Asp Arg Asp
340 345 350 

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr 
355 360 365 

Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
370 375 380 

Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400 

Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415 

Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430 

Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445 

Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460 

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe 
465 470 475 480 

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495 
Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510

Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe
530 535 540

Val Asp Ala Ile Val Phe
545 550



<210> 27 
<211> 550
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 

E-beta-farnesene synthase protein variant

<220> 

<221> VARIANT 
<222> (1)..(550)

<223> Computer-generated E-beta-farnesene synthase
protein variant 

<400> 27 

Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn
20 25 30 

Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45 

Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Ser Pro
50 55 60 

Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80

Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95 
Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg
100 105 110 

Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Asp Lys
115 120 125 

Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
130 135 140 

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu 
145 150 155 160 

Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His His Leu Glu
165 170 175 

Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205 

Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220 

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240 

Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
245 250 255 

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val 
260 265 270 

Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285

Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300 

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320 

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys 
325 330 335 

Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350 
Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr
355 360 365 

Val Lys Gln Leu Ala Arg Ala Phe Asn Asp Glu Gln Lys Trp Val Met
370 375 380 

Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400 

Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415 

Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430 

Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445 

Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460 

His Met Lys Glu Tyr Gly Leu Thr Lys Glu Glu Ala Ala Ser Lys Phe 
465 470 475 480 

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495 

Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510

Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525 

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Val Val Ala Leu Phe 
530 535 540

Val Asp Ala Ile Val Phe
545 550 



<210> 28 
<211> 550 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
E-beta-farnesene synthase protein
<220>

<221> VARIANT
<222> (1)..(550)

<223> Computer-generated E-beta-farnesene synthase
protein variant 

<400> 28 

Met Ala Thr Asn Gly Val Val Ile Ser Cys Leu Arg Glu Val Arg Pro
1 5 10 15

Pro Met Thr Lys His Ala Pro Ser Met Trp Thr Asp Thr Phe Ser Asn
20 25 30

Phe Ser Leu Asp Asp Lys Glu Gln Gln Lys Cys Ser Glu Thr Ile Glu
35 40 45 

Ala Leu Lys Gln Glu Ala Arg Gly Met Leu Met Ala Ala Thr Thr Pro
50 55 60 

Leu Gln Gln Met Thr Leu Ile Asp Thr Leu Glu Arg Leu Gly Leu Ser
65 70 75 80 

Phe His Phe Glu Thr Glu Ile Glu Tyr Lys Ile Glu Leu Ile Asn Ala
85 90 95 

Ala Glu Asp Asp Gly Phe Asp Leu Phe Ala Thr Ala Leu Arg Phe Arg 
100 105 110 

Leu Leu Arg Gln His Gln Arg His Val Ser Cys Asp Val Phe Asp Lys
115 120 125 

Phe Ile Asp Lys Asp Gly Lys Phe Glu Glu Ser Leu Ser Asn Asn Val
130 135 140 

Glu Gly Leu Leu Ser Leu Tyr Glu Ala Ala His Val Gly Phe Arg Glu 
145 150 155 160

Glu Arg Ile Leu Gln Glu Ala Val Asn Phe Thr Arg His His Leu Glu
165 170 175 

Gly Ala Glu Leu Asp Gln Ser Pro Leu Leu Ile Arg Glu Lys Val Lys
180 185 190 

Arg Ala Leu Glu His Pro Leu His Arg Asp Phe Pro Ile Val Tyr Ala
195 200 205 

Arg Leu Phe Ile Ser Ile Tyr Glu Lys Asp Asp Ser Arg Asp Glu Leu
210 215 220

Leu Leu Lys Leu Ser Lys Val Asn Phe Lys Phe Met Gln Asn Leu Tyr
225 230 235 240 

Lys Glu Glu Leu Ser Gln Leu Ser Arg Trp Trp Asn Thr Trp Asn Leu
245 250 255 

Lys Ser Lys Leu Pro Tyr Ala Arg Asp Arg Val Val Glu Ala Tyr Val 
260 265 270 

Trp Gly Val Gly Tyr His Tyr Glu Pro Gln Tyr Ser Tyr Val Arg Met
275 280 285

Gly Leu Ala Lys Gly Val Leu Ile Cys Gly Ile Met Asp Asp Thr Tyr
290 295 300 

Asp Asn Tyr Ala Thr Leu Asn Glu Ala Gln Leu Phe Thr Gln Val Leu
305 310 315 320 

Asp Lys Trp Asp Arg Asp Glu Ala Glu Arg Leu Pro Glu Tyr Met Lys 
325 330 335 

Ile Val Tyr Arg Phe Ile Leu Ser Ile Tyr Glu Asn Tyr Glu Arg Asp
340 345 350 

Ala Ala Lys Leu Gly Lys Ser Phe Ala Ala Pro Tyr Phe Lys Glu Thr
355 360 365 

Val Lys Gln Leu Ala Arg Ala Phe Asn Glu Glu Gln Lys Trp Val Met
370 375 380 

Glu Arg Gln Leu Pro Ser Phe Gln Asp Tyr Val Lys Asn Ser Glu Lys
385 390 395 400 

Thr Ser Cys Ile Tyr Thr Met Phe Ala Ser Ile Ile Pro Gly Leu Lys
405 410 415 

Ser Val Thr Gln Glu Thr Ile Asp Trp Ile Lys Ser Glu Pro Thr Leu
420 425 430 

Ala Thr Ser Thr Ala Met Ile Gly Arg Tyr Trp Asn Asp Thr Ser Ser
435 440 445 

Gln Leu Arg Glu Ser Lys Gly Gly Glu Met Leu Thr Ala Leu Asp Phe
450 455 460 

His Met Lys Glu Tyr Gly Leu Thr Lys Asp Glu Ala Ala Ser Lys Phe 
465 470 475 480

Glu Gly Leu Val Glu Glu Thr Trp Lys Asp Ile Asn Lys Glu Phe Ile
485 490 495 

Ala Thr Thr Asn Tyr Asn Val Gly Arg Glu Ile Ala Ile Thr Phe Leu
500 505 510

Asn Tyr Ala Arg Ile Cys Glu Ala Ser Tyr Ser Lys Thr Asp Gly Asp
515 520 525 

Ala Tyr Ser Asp Pro Asn Val Ala Lys Ala Asn Ile Val Ala Leu Phe
530 535 540

Val Asp Ala Ile Val Phe
545 550 